ABSTRACT


Guidance  laws  are  developed  for  tactical 
missiles  which  take  into  account  the  following  important 
dynamic  and  random  effects:  random  target  motion, 
homing  sensor  measurement  noise,  bounded  control 
level,  bounded  acceleration  level,  and  missile  autopilot 
dynamics.  Several  different  guidance  laws  are  derived 
using  optimal  stochastic  control  theory  and  evaluated 
by  computer  simulation.  An  important  conclusion  of 
this  work  is  that  when  intercept  accuracy  is  appreciably 
limited  by  missile  maneuvering  capability,  a  control 
policy  obtained  by  taking  control  saturation  into  account 
can  yield  significantly  better  performance  than  control 
policies derived assuming that control levels are unconstrained.
1.  INTRODUCTION


1.1  BACKGROUND  AND  OBJECTIVES 

The task of guiding a tactical missile to a target is affected by a
number of factors and constraints, e.g., target maneuvering capability,
homing sensor measurement errors, missile autopilot dynamics, bounded
control variables, limited missile maneuvering capability, and launch
initial conditions. To overcome those effects, a number of guidance
techniques have been developed and evaluated (Refs. 1, 2, 3, 4, 5, 6).
Heretofore, most guidance laws have been derived assuming fairly simple
mathematical models of the missile-target engagement problem. A familiar
example is so-called proportional guidance, which is designed primarily for
constant velocity targets and unconstrained missile controls. It is
frequently found that guidance laws derived in this fashion yield
unacceptable terminal miss distances when applied in situations where target
maneuvers, etc., exist. Consequently one is motivated to obtain improved
performance by including within the guidance problem formulation more of
those factors which affect the missile's interception capability.

In Ref. 6 a number of guidance laws which offer improvements
over conventional proportional guidance are evaluated. These laws are
derived with the aid of optimal control theory from mathematical models
that include the effects of initial condition errors, missile airframe
dynamics, constant target acceleration and a penalty on the amount of
control effort consumed. This report represents a continuation of that
effort; guidance laws are developed which include the effects of measurement
noise, bounded control levels, bounded maneuvering acceleration level and
random time-varying target maneuvers. Emphasis is placed upon those




THE  ANALYTIC  SCIENCES  CORPORATION 


techniques which can potentially be applied in practical tactical missile
weapons systems in the next ten to twenty years, especially those which
can take advantage of the rapid improvement in computer hardware technology.


In Chapter 2 guidance laws are derived, using some results from
optimal stochastic control theory described in Appendix A, which account
for measurement noise, random target acceleration, and bounded missile
control variables. Performance results for these laws, obtained by
computer simulation, are presented in Chapter 3. In addition, an
acceleration limiting technique is developed and evaluated in Chapter 3, its
purpose being to prevent the missile lateral acceleration from exceeding
prescribed limits. A summary of the results and the major conclusions are
given in Chapter 4.
2.  OPTIMAL  STOCHASTIC  GUIDANCE  LAWS


In this chapter stochastic guidance problems for a tactical missile,
including the effects of bounded control variables and sensor measurement
errors, are formulated and solved. First a mathematical model is
developed which provides a standard description of the guidance system
dynamics for use throughout the report. Then both optimal and suboptimal
nonlinear guidance laws are derived; the performance of each law is
subsequently evaluated in Chapter 3 by digital computer simulations of the
system model.


2.1  PROBLEM  FORMULATION 

The equations of motion for the missile guidance problem are
derived assuming motion is confined to a single plane and neglecting forces
caused by gravity and aerodynamic drag.* Referring to Fig. 2.1-1, a
nonrotating orthogonal coordinate system is defined with the x-axis chosen
along the line-of-sight (LOS) between the interceptor and the target at the
beginning of the engagement. The center of the coordinate system moves
with the target but the coordinate axes do not rotate.


* In actual applications, drag can significantly reduce the airspeed of
a coasting missile, thereby adversely affecting guidance accuracy.
The exclusion of this effect here is justified on the basis that we are
seeking guidance law design criteria that offer improvement over
conventional methods with respect to more significant guidance error
sources. However, a more complete system evaluation of the methods
resulting from this study would certainly include aerodynamic forces,
as well as other factors neglected in this simplified investigation.


Figure  2.1-1  Relative  Coordinate  System 

If the guidance system works well, a reasonable conjecture is
that the LOS rotates very little along the missile's trajectory, except near
the end when the range becomes small (less than 100 feet). This assumption
is suggested by the similarity between optimal linear deterministic
guidance laws and conventional proportional guidance in that all such
techniques tend to achieve a small LOS angular rate (Refs. 5, 6). Consequently
at the terminal time $t_f$, the missile trajectory intersects the y-axis in
Fig. 2.1-1 almost perpendicularly and the terminal miss distance is
approximately $y(t_f)$. Therefore, the missile's motion parallel to the y-axis
is of primary interest.

We shall assume that the control variable available for the guidance
law is the output u(t) of the missile's control actuation mechanism,
e.g., a control surface deflection.* The latter, acting through the missile
rotational dynamics, provides an acceleration vector $\mathbf{a}_m$ that
changes the interceptor's flight path. In a nonthrusting drag-free vehicle,
$\mathbf{a}_m$ is approximately perpendicular to the missile's velocity as
indicated in Fig. 2.1-1. Only the y-component of $\mathbf{a}_m$, given by

$$a_{m_y}(t) = -a_m(t)\cos\theta$$  (2.1-1)

where $a_m(t) = |\mathbf{a}_m(t)|$, is important for controlling terminal
miss. If the orientation of $\mathbf{v}_m$ is assumed to be slowly varying,
$\cos\theta$ can be treated as a known scale factor; throughout this
discussion we assume $\cos\theta = 1$.

* This assumption neglects actuator dynamics, which typically have much
faster response characteristics than the missile rotational dynamics.

In many applications the missile rotational equations of motion
can be modeled as being linear; therefore they can be written in state
variable form as

$$\dot{x}_m(t) = F_m x_m(t) + g_m u(t)$$
$$a_m(t) = c_m^T x_m(t) + d\,u(t)$$  (2.1-2)

where the acceleration is regarded as an output variable that in general
can be a function of both the state $x_m(t)$ and the control u(t).* In this
report $F_m$, $g_m$, $c_m$, and d are assumed to be constant arrays, a
condition that needs some elaboration. In many applications missile dynamic
characteristics vary rapidly because of changing flight conditions,
especially when thrusting at a high g-level. In these situations, the
parameters in Eq. (2.1-2) may be treated as constant if an adaptive autopilot
has been designed which maintains known, uniform dynamic characteristics.


* For example, in a tail-controlled lifting vehicle u(t) can represent
the control surface deflection which contributes directly to missile
lateral acceleration.
Alternatively, when the elements of $F_m$, etc., are time-varying, they can
be estimated on-line by a parameter identification technique which can
track time-varying parameters. In either case, the subsequent development
has application; however we shall see that if the airframe parameters
are known a priori, certain feedback control gains can be determined
off-line and stored in the guidance computer; if the parameters are
identified in flight, then the control gains must be calculated on-line.

In addition to the missile's acceleration, the target acceleration
$\mathbf{a}_t(t)$ has an effect on the guidance dynamics. In particular,
from Fig. 2.1-1 it follows that

$$\ddot{y}(t) = a_{t_y}(t) + a_{m_y}(t)$$  (2.1-3)

where $a_{t_y}(t)$ is the component of $\mathbf{a}_t$ along the y-axis. We
shall assume that the target accelerates randomly according to the relations

$$\dot{x}_t(t) = F_t x_t(t) + w_t(t)$$
$$a_{t_y}(t) = c_t^T x_t(t)$$  (2.1-4)

where $w_t(t)$ is a gaussian white noise process having statistics described
by*

$$E\{w_t(t)\} = 0$$
$$E\{w_t(t)\,w_t(\tau)^T\} = Q_t\,\delta(t-\tau)$$  (2.1-5)

* A nonzero, known mean can readily be included in the development.
and $F_t$ and $c_t$ are known constant arrays. The matrix $Q_t$ is constant
and positive semidefinite, and $\delta(t-\tau)$ is the unit impulse function.
This model represents the target's acceleration as the output of a linear
system driven by white noise. It is a good representation insofar
as the target maneuvers appear to be random and correlated in
time. The correlation characteristics of $a_{t_y}(t)$, which determine
the extent to which the target's future maneuvers can be predicted
from previous measurements of target motion, are determined by
$F_t$, $c_t$, and $Q_t$.

Random  maneuvers  are  frequently  used  by  aircraft  flying  in  a 
region  where  they  are  subject  to  attack  by  missiles,  especially  by  surface- 
to-air  missiles  (SAM's)  (Ref.  7).  The  pilot's  purpose  is  to  prevent  SAM 
radar  trackers  from  acquiring  a  fix  on  the  aircraft.  However,  if  the  pilot 
knows  a  SAM  has  been  launched,  he  is  more  likely  to  employ  one  of  several 
deterministic-type  maneuvers  which  have  been  historically  successful  in 
avoiding  intercepts.  To  analyze  the  latter  situation,  game  theory  may  allow 
a  more  realistic  problem  formulation  in  that  the  target  aircraft  can  be 
modeled as an intelligent evader whose objective is to maximize the
terminal miss distance. In this report only random target motion is
considered; the application of game theory is an important topic for future
investigation. 

Combining Eqs. (2.1-2), (2.1-3), and (2.1-4) the complete set
of state equations for the guidance problem can be written as*

$$\frac{d}{dt}\begin{bmatrix} y(t) \\ \dot{y}(t) \\ x_t(t) \\ x_m(t) \end{bmatrix}
= \begin{bmatrix} 0 & 1 & 0^T & 0^T \\ 0 & 0 & c_t^T & -c_m^T \\ 0 & 0 & F_t & [0] \\ 0 & 0 & [0] & F_m \end{bmatrix}
\begin{bmatrix} y(t) \\ \dot{y}(t) \\ x_t(t) \\ x_m(t) \end{bmatrix}
+ \begin{bmatrix} 0 \\ -d \\ 0 \\ g_m \end{bmatrix} u(t)
+ \begin{bmatrix} 0 \\ 0 \\ w_t(t) \\ 0 \end{bmatrix}$$  (2.1-6)

* The symbols 0 and [0] denote respectively a vector and a matrix
having all elements zero.


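or more compactly as

$$\dot{x}(t) = F x(t) + g\,u(t) + w(t)$$  (2.1-7)

where

$$x(t) = \begin{bmatrix} y(t) \\ \dot{y}(t) \\ x_t(t) \\ x_m(t) \end{bmatrix}; \quad
w(t) = \begin{bmatrix} 0 \\ 0 \\ w_t(t) \\ 0 \end{bmatrix}$$

$$E\{w(t)\,w(\tau)^T\} = Q\,\delta(t-\tau)
= \begin{bmatrix} 0 & 0 & 0^T & 0^T \\ 0 & 0 & 0^T & 0^T \\ 0 & 0 & Q_t & [0] \\ 0 & 0 & [0] & [0] \end{bmatrix} \delta(t-\tau)$$  (2.1-8)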


and F and g are identified as the matrix and vector coefficients of x(t) and
u(t), respectively, in Eq. (2.1-6). The initial value of x(t) is assumed to
be a vector gaussian random variable with known statistics given by

$$E\{x(0)\} = \mu$$
$$E\{[x(0) - \mu][x(0) - \mu]^T\} = P_0$$
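As an illustration of how the block structure of Eq. (2.1-6) can be assembled numerically, the sketch below builds F and g from the target and airframe arrays. The function name `build_guidance_model` and the use of plain nested lists are choices made here for illustration only; the example values in the usage note are hypothetical.

```python
def build_guidance_model(F_t, c_t, F_m, g_m, c_m, d):
    """Assemble F and g of Eq. (2.1-7) for the state [y, y_dot, x_t, x_m].

    F_t, F_m are square nested lists; c_t, g_m, c_m are flat lists; d is a
    scalar. Returns (F, g) as a nested list and a flat list.
    """
    n_t, n_m = len(F_t), len(F_m)
    n = 2 + n_t + n_m
    F = [[0.0] * n for _ in range(n)]
    F[0][1] = 1.0                                  # d(y)/dt = y_dot
    for j in range(n_t):
        F[1][2 + j] = c_t[j]                       # + target acceleration
    for j in range(n_m):
        F[1][2 + n_t + j] = -c_m[j]                # - missile acceleration
    for r in range(n_t):
        for c in range(n_t):
            F[2 + r][2 + c] = F_t[r][c]            # target model block
    for r in range(n_m):
        for c in range(n_m):
            F[2 + n_t + r][2 + n_t + c] = F_m[r][c]  # airframe block
    g = [0.0, -d] + [0.0] * n_t + list(g_m)
    return F, g
```

For instance, `build_guidance_model([[-0.5]], [1.0], [[-2.0]], [2.0], [1.0], 0.0)` describes a first-order target maneuver model and a first-order autopilot lag with no direct feedthrough, giving a four-element state.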


The objective in designing the guidance system is to derive a
feedback control law for u(t) having performance that is optimum in some
sense. In order to provide feedback, measurements related to the elements
of x(t) must be available. It is usually realistic to assume that
measurements of line-of-sight angle or angle rate are available from a homing
sensor. From Fig. 2.1-1, the LOS angle $\lambda(t)$ is given approximately by

$$\lambda(t) = \frac{y(t)}{v_c\,(t_f - t)}$$  (2.1-9)

where $t_f$ is the terminal time and $v_c$ is the magnitude of the closing
velocity, which is assumed to be constant. If Eq. (2.1-9) is differentiated
with respect to time, the result is

$$\dot{\lambda}(t) = \frac{y(t)}{v_c\,(t_f - t)^2} + \frac{\dot{y}(t)}{v_c\,(t_f - t)}$$  (2.1-10)

Consequently, within the limits of the approximation stated in Eq. (2.1-9), an
LOS rate measurement is linearly related to the state variables y(t) and
$\dot{y}(t)$. In addition, linear measurements of some of the missile airframe
state variables $x_m(t)$ (pitch rate, lateral acceleration, etc.) are also
generally available. The set of all these measurements, z(t), is considered
to be available at discrete times $t_i$ and corrupted by additive gaussian
noise; thus $z(t_i)$ can be expressed as

$$z(t_i) = z_i = H_i x_i + v_i$$  (2.1-11)
where

$$H_i = \begin{bmatrix} \dfrac{1}{v_c\,(t_f - t_i)^2} & \dfrac{1}{v_c\,(t_f - t_i)} & 0^T & 0^T \\ 0 & 0 & [0] & H_m \end{bmatrix}$$

$$E\{v_i\} = 0$$

$$E\{v_i v_j^T\} = \begin{cases} R_i; & i = j \\ 0; & i \neq j \end{cases}$$  (2.1-12)

and $R_i$ is a positive definite matrix that can vary with time. The quantity
$x_i$ denotes $x(t_i)$ in Eq. (2.1-7) and the matrix $H_m$ describes the linear
relation between the missile airframe state variables and the measurements.

In situations where the homing sensor output is interpreted as an
LOS angle, $H_i$ takes the form

$$H_i = \begin{bmatrix} \dfrac{1}{v_c\,(t_f - t_i)} & 0 & 0^T & 0^T \\ 0 & 0 & [0] & H_m \end{bmatrix}$$

In this report we use Eq. (2.1-12) as the sensor model.
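The LOS relations of Eqs. (2.1-9) and (2.1-10), and the first row of the LOS-rate measurement matrix in Eq. (2.1-12), can be sketched as follows. The function names and the argument `t_go` (standing for $t_f - t_i$) are illustrative choices, not notation from the report.

```python
def los_angle(y, v_c, t_go):
    """Small-angle LOS angle, Eq. (2.1-9): y / (v_c * t_go)."""
    return y / (v_c * t_go)


def los_rate(y, y_dot, v_c, t_go):
    """LOS rate, Eq. (2.1-10), obtained by differentiating Eq. (2.1-9)."""
    return y / (v_c * t_go ** 2) + y_dot / (v_c * t_go)


def los_rate_row(v_c, t_go, n_t, n_m):
    """First row of H_i in Eq. (2.1-12) for state [y, y_dot, x_t, x_m].

    n_t and n_m are the dimensions of the target and airframe states, which
    do not enter the LOS-rate measurement directly.
    """
    return [1.0 / (v_c * t_go ** 2), 1.0 / (v_c * t_go)] + [0.0] * (n_t + n_m)
```

Both coefficients grow as `t_go` shrinks, which is why the measurement matrix must be recomputed at every measurement time.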


In order to guide the missile, the measurement data is to be
used for computing control commands. To allow for the time required
to make the necessary calculations, it is assumed that a new value of the
control can be computed only at each measurement time $t_i$. Thus on the
interval $t_i \le t < t_{i+1}$, u(t) is held constant at the value of
$u(t_i)$. Because both the measurements and the controls are generated at
discrete points in time, we use the discrete equivalent of Eq. (2.1-7):

$$u_i = u(t_i); \quad x_i = x(t_i)$$
$$x_{i+1} = \Phi_i x_i + \gamma_i u_i + w_i$$  (2.1-13)
where $\Phi_i$, $\gamma_i$ and $w_i$ are determined from F, g, and w(t)
according to Eqs. (A.1-12) and (A.1-13) of Appendix A. For this application
the interval between measurements is assumed to be of uniform length,
$\Delta t$,

$$\Delta t = t_{i+1} - t_i; \quad i = 0, 1, \ldots, N-1$$

with $t_N = t_f$. Therefore, because the dynamics in Eq. (2.1-7) and the
statistics of w(t) are constant, Eqs. (2.1-13), (A.1-12) and (A.1-13) can be
written as

$$\Phi = e^{F \Delta t}$$
$$\gamma = \int_0^{\Delta t} e^{F(\Delta t - \tau)}\, g \, d\tau$$
$$x_{i+1} = \Phi x_i + \gamma u_i + w_i; \quad i = 0, 1, \ldots, N-1$$  (2.1-14)

where $w_i$ is a gaussian random sequence satisfying

$$E\{w_i\} = 0$$
$$E\{w_i w_i^T\} = Q_d = \int_0^{\Delta t} e^{F(\Delta t - \tau)}\, Q \left(e^{F(\Delta t - \tau)}\right)^T d\tau$$
$$E\{w_i w_j^T\} = 0; \quad i \neq j$$  (2.1-15)
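One common way to evaluate $\Phi$ and $\gamma$ of Eq. (2.1-14) numerically, sketched below, is to exponentiate an augmented matrix built from $F\Delta t$ and $g\Delta t$; the upper-left block of the result is $\Phi$ and the last column is $\gamma$. This is a standard numerical device and is not claimed to be the method of Appendix A. The truncated power series used for the exponential is an assumption that is adequate only when $\|F\|\Delta t$ is modest.

```python
def mat_mul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def expm_series(A, terms=30):
    """Matrix exponential by truncated power series (small-norm A only)."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]  # A^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def discretize(F, g, dt):
    """Return (Phi, gamma) of Eq. (2.1-14) for a zero-order-hold control."""
    n = len(F)
    # Augmented matrix [[F*dt, g*dt], [0, 0]]; its exponential is
    # [[Phi, gamma], [0, 1]].
    M = [[F[i][j] * dt for j in range(n)] + [g[i] * dt] for i in range(n)]
    M.append([0.0] * (n + 1))
    E = expm_series(M)
    Phi = [row[:n] for row in E[:n]]
    gamma = [row[n] for row in E[:n]]
    return Phi, gamma
```

For the two-state case with no target or autopilot dynamics and d = 1, `discretize([[0.0, 1.0], [0.0, 0.0]], [0.0, -1.0], dt)` reproduces the familiar result $\Phi = [[1, \Delta t], [0, 1]]$ and $\gamma = [-\Delta t^2/2, -\Delta t]$.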


Having Eqs. (2.1-11) and (2.1-14) describing the measurement
sequence and the discrete-time dynamics, we desire to establish rational
performance criteria for determining each control $u_i$. In many applications
the most important objective is that the terminal miss distance be made as
small as possible. More precisely, if we define a loss function of the
terminal miss distance $f(x_1(t_f))$, where $x_1$ is the first element of the
state vector x, i.e., the quantity y in Fig. 2.1-1, and $t_f$ is the terminal
time, then we say that guidance performance is optimized if the index

$$J = E\{f(x_1(t_f))\}$$

is minimized. The designer's objective is to determine the sequence of
optimal control commands which accomplishes this goal. In this report the
loss function used is the square of the miss distance so that the
performance index becomes

$$J_1 = E\{x_1(t_f)^2\}$$  (2.1-16)

In practical applications the allowable values of the control are bounded in
magnitude; typically for tactical missiles the control surface deflection is
limited to a few degrees. Thus, our objective is to minimize $J_1$ subject
to the constraint*

$$|u_i| \le D; \quad i = 0, 1, \ldots, N-1$$  (2.1-17)

It is subsequently demonstrated that this problem formulation leads to an
optimal nonlinear stochastic control law; i.e., $u_i$ is a nonlinear function
of past measurements.

* Depending upon the type of control actuation mechanism in use,
it may be desirable to restrict other variables as well, say
du/dt. Such a requirement can complicate the task of finding
the optimal control law and the designer may have to settle for a
suboptimal law that satisfies the constraints but which does not
exactly minimize $J_1$.
If the energy expended by the control law is of no importance,
as may be the case in short range missions or near the end of a long range
mission, the above problem formulation is quite realistic. Its solution
will indicate the ultimate guidance accuracy that can be achieved for a
given target engagement situation when control level is bounded. However,
for comparison purposes it is convenient to consider alternative
performance criteria that have been advocated for the guidance problem.

In Ref. 6 linear guidance laws for a continuous-time, deterministic
problem formulation are evaluated. These laws are chosen to
minimize a quadratic performance index of the form

$$J = x_1(t_f)^2 + r \int_0^{t_f} u(t)^2 \, dt$$  (2.1-18)
for various models of the missile autopilot dynamics and target
maneuvering capability, but without any direct constraint on |u(t)|. This
performance criterion has one advantage over that outlined above, in that the
presence of u(t) in the definition of J results in a guidance law that tends
to conserve missile energy.* However, it lacks a capability for directly
constraining control magnitude. Therefore the control levels called for
can exceed any limit which may exist; this condition tends to occur most
frequently near the expected time of intercept when the observed
line-of-sight angular rate tends to become large.

* This statement must be qualified with respect to the type of energy
consumption one is talking about. If the control surface actuator is
electromagnetic, a constant electric current must be provided to maintain
a constant control surface deflection and $\int u(t)^2\,dt$ is proportional
to the electrical energy consumed. However, in electrohydraulic systems,
power is required only when the control surface is in motion so
that $\int \dot{u}(t)^2\,dt$ is a better measure of energy. In addition, some
systems pump hydraulic fluid into the atmosphere; in this case
$\int |u(t)|\,dt$ represents the amount of fluid expended. Besides actuator
energy/fluid losses, the missile incurs a kinetic energy loss proportional
to $\int |a(t)|\,dt$ when it performs a maneuver at constant altitude.
Although J in Eq. (2.1-18) is directly related only to energy used by the
electromagnetic type of actuator, it is frequently observed that utilizing a
penalty on the integral of $u(t)^2$ produces a control law that also tends to
limit all of the other losses mentioned above. Therefore we are qualitatively
correct in saying that minimization of J in Eq. (2.1-18) tends to conserve
missile energy.

By analogy with Eq. (2.1-18), in this study we investigate a
performance index having a quadratic penalty on control level, to be
compared with $J_1$ in Eq. (2.1-16). Namely, we seek those unconstrained
controls $u_i$ which minimize the index

$$J_2 = E\left\{ x_1(t_f)^2 + r \sum_{i=0}^{N-1} u_i^2 \right\}$$  (2.1-19)

This design criterion ordinarily leads to a linear stochastic control law;
i.e., $u_i$ is a linear function of the measurements. However, because the
actual missile control capability is constrained according to Eq. (2.1-17),
the control sequence obtained by minimizing $J_2$ is "clipped" when applied
in the actual guidance system, resulting in a suboptimal nonlinear stochastic
guidance law.

The  solutions  to  the  above  two  guidance  problems  are  given  in 
the  next  two  sections. 

2.2  OPTIMAL  NONLINEAR  STOCHASTIC  GUIDANCE  LAW 

The optimal stochastic guidance problem associated with Eqs.
(2.1-16) and (2.1-17) is summarized as follows:

Given the linear discrete-time dynamic relations

$$x_{i+1} = \Phi x_i + \gamma u_i + w_i$$  (2.2-1)

with linear measurements

$$z_i = H_i x_i + v_i$$  (2.2-2)

determine the optimal sequence of controls $u_i^o$*
(i = 0, 1, ..., N-1) such that the performance index

$$J_1 = E\{x_1(t_f)^2\}$$  (2.2-3)

is minimized, subject to the constraint

$$|u_i| \le D; \quad \text{for all } i$$  (2.2-4)

Definitions of the quantities $\Phi$, $\gamma$, $H_i$, and $v_i$ are
available in Eqs. (2.1-12), (2.1-14) and (2.1-15). The above problem
formulation is a discrete-time generalization of the case treated by Nahi and
Sworder (Ref. 8); the latter is a continuous-time problem which does not take
into account target or autopilot dynamics. Fortunately the optimal guidance
law is readily obtained as described in Appendix A. Its mechanization can
be described as two separate functions.

First, a conventional Kalman filter is implemented to obtain an
estimate $\hat{x}_i$ of the state $x_i$. The required filtering equations,
taken from Eqs. (A.2-1) and (A.2-4), are as follows:

* The superscript "o" denotes optimal.
where $\mu$ and $P_0$ are the initial mean and covariance matrix of the state,
and $\hat{x}_i$ and $P_i$ denote respectively the estimate and its
corresponding covariance matrix just before a measurement is taken.
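For orientation, a conventional discrete Kalman filter cycle can be sketched as below. This is the standard textbook form, written for a single scalar measurement so that no matrix inversion is needed; it is not a transcription of Eq. (2.2-5), whose exact arrangement may differ. The function and argument names are illustrative assumptions.

```python
def kalman_update(x_pred, P_pred, h, r, z):
    """Measurement update for one scalar measurement z = h . x + v.

    x_pred, P_pred: estimate and covariance just before the measurement.
    h: measurement row (flat list); r: measurement noise variance.
    """
    n = len(x_pred)
    Ph = [sum(P_pred[i][j] * h[j] for j in range(n)) for i in range(n)]
    s = sum(h[i] * Ph[i] for i in range(n)) + r       # innovation variance
    K = [Ph[i] / s for i in range(n)]                 # Kalman gain
    innov = z - sum(h[i] * x_pred[i] for i in range(n))
    x_upd = [x_pred[i] + K[i] * innov for i in range(n)]
    # P_upd = P - K (h^T P); uses symmetry of P so that h^T P = (P h)^T.
    P_upd = [[P_pred[i][j] - K[i] * Ph[j] for j in range(n)]
             for i in range(n)]
    return x_upd, P_upd


def kalman_predict(x_upd, P_upd, Phi, gamma, u, Q_d):
    """Propagate estimate and covariance through Eq. (2.2-1) to t_{i+1}."""
    n = len(x_upd)
    x_pred = [sum(Phi[i][j] * x_upd[j] for j in range(n)) + gamma[i] * u
              for i in range(n)]
    PhiP = [[sum(Phi[i][k] * P_upd[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]
    P_pred = [[sum(PhiP[i][k] * Phi[j][k] for k in range(n)) + Q_d[i][j]
               for j in range(n)] for i in range(n)]
    return x_pred, P_pred
```

The filter would be started from the initial mean and covariance of the state and cycled once per measurement interval, with the applied (possibly saturated) control passed to `kalman_predict`.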

To obtain the second part of the solution, i.e., the method for
calculating $u_i^o$, we begin by transforming $\hat{x}_i$ linearly according
to Eq. (A.4-8) to obtain the quantity $\hat{y}_i$; the latter is the expected
value of the terminal state, given $E\{x(t_i)\} = \hat{x}_i$, if no control is
applied during the interval $t_i \le t \le t_f$. Because $\Phi$ is
independent of time, $\hat{y}_i$ becomes

$$\hat{y}_i = \Phi^{N-i}\, \hat{x}_i$$  (2.2-6)

where we have made the substitution

$$\Phi(t_f, t_i) = \Phi^{N-i}$$

Actually only the first element, $\hat{y}_{1,i}$, of the transformed state is
needed. If the first row of $\Phi^{N-i}$ is defined to be a transposed column
vector, $\varphi_i^T$, we have

$$\hat{y}_{1,i} = \varphi_i^T \hat{x}_i$$  (2.2-7)

Similarly, the vector $\gamma$ in Eq. (2.2-1) is transformed according to Eq.
(A.4-6) to obtain the quantity $\delta_i$, which represents the effect on the
terminal state of a constant control $u(t) = u_i$ applied during the interval
$t_i \le t < t_{i+1}$:

$$\delta_i = \Phi^{N-i-1}\, \gamma$$  (2.2-8)

Only the first element of $\delta_i$ is needed to describe the effect of the
control on the terminal miss distance; therefore by analogy with Eq. (2.2-7)
we calculate

$$\delta_{1,i} = \varphi_{i+1}^T \gamma$$  (2.2-9)

where $\varphi_{i+1}^T$ is the first row of $\Phi^{N-i-1}$. Now the optimal
control strategy, as proved in Appendix A, is to select a control such that
the total predicted terminal miss produced by both $\hat{y}_{1,i}$ and $u_i$
be as close to zero as possible; i.e., we desire

$$\hat{y}_{1,i} + \delta_{1,i}\, u_i = 0$$

remembering that the constraint in Eq. (2.2-4) must also be satisfied.
Consequently the optimal nonlinear control law (see Eq. (A.4-17)) is given by

$$u_i^o = \begin{cases} -\hat{y}_{1,i}/\delta_{1,i}; & |\hat{y}_{1,i}/\delta_{1,i}| \le D \\ -D\,\mathrm{sgn}(\hat{y}_{1,i}/\delta_{1,i}); & |\hat{y}_{1,i}/\delta_{1,i}| > D \end{cases}$$

or alternatively

$$u_i^o = -D\,\mathrm{sat}\!\left(\frac{\hat{y}_{1,i}}{D\,\delta_{1,i}}\right)$$  (2.2-10)

where sat(v) denotes a unit amplitude limiter: sat(v) = v for |v| ≤ 1 and
sat(v) = sgn(v) otherwise.


Thus the complete guidance law is represented as a linear filter
cascaded with a nonlinear control policy; the latter consists of a set of
gains $d_i$ followed by an amplitude limiter. A block diagram of the system
is given in Fig. 2.2-1. We shall refer to the entire sequence of the
optimal control as $\{u_i^o\}$.
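The control policy just described, a gain followed by an amplitude limiter, can be sketched as follows. The names `predicted_miss`, `control_effectiveness` and `optimal_control` are illustrative; the handling of a zero control effectiveness is an assumption added here to keep the sketch well defined and is not discussed in the report.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))


def predicted_miss(phi_i, x_hat):
    """Predicted zero-control terminal miss, Eq. (2.2-7)."""
    return dot(phi_i, x_hat)


def control_effectiveness(phi_next, gamma):
    """Effect of a unit control over one interval on the miss, Eq. (2.2-9)."""
    return dot(phi_next, gamma)


def optimal_control(y1_hat, delta1, D):
    """Saturated control of Eq. (2.2-10): null the predicted miss if possible.

    Returns 0.0 when delta1 is zero (assumption: the control then cannot
    influence the terminal miss, so no effort is commanded).
    """
    if delta1 == 0.0:
        return 0.0
    u = -y1_hat / delta1
    return max(-D, min(D, u))
```

As a usage example under assumed values, for the two-state model with $\Delta t = 0.1$ and ten steps to go, `phi_i = [1.0, 1.0]`, `phi_next = [1.0, 0.9]` and `gamma = [-0.005, -0.1]`, so that a predicted miss of 1 with `D = 5` drives the control to its limit.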
Figure  2.2-1  Optimal  Nonlinear  Stochastic  Guidance  Law

At this point it is worth mentioning that the guidance law derived
above is much more general than implied by the statement of the guidance
problem at the beginning of this section. It is proved in Section A.4 that
the control sequence given in Eq. (2.2-10) minimizes any convex symmetric
function of the terminal miss distance subject to the constraint in
Eq. (2.2-4). Consequently one can say that, in a very broad sense, this
guidance law yields the best possible terminal accuracy, within the
missile's control capability.

The mechanization of Eqs. (2.2-5) and (2.2-10) requires computation
of both the set of Kalman filter gains $K_i$ and the feedback gains
$d_i$, given by

$$d_i^T = \frac{\varphi_i^T}{\varphi_{i+1}^T \gamma}$$

First, with respect to the feedback gains, both $\varphi_i$ and
$\varphi_{i+1}$ are derived from the matrices $\Phi^{N-i}$ and
$\Phi^{N-i-1}$. The latter usually can be determined analytically in this
application. We have already noted that $\Phi^{N-i}$ is the transition matrix
$\Phi(t_f, t_i)$ associated with Eq. (2.1-7). Because F is time-invariant, it
follows that

$$\Phi(t_f, t_i) = \Phi(t_f - t_i, 0)$$  (2.2-11)

An analytical expression for $\Phi(t, 0)$ as a function of t can be obtained
by applying Laplace transforms to the homogeneous equation

$$\dot{x}(t) = F x(t)$$

associated with Eq. (2.1-7). The result is

$$\Phi(t, 0) = L^{-1}\left[(Is - F)^{-1}\right]$$  (2.2-12)

where

$$(Is - F)^{-1} = \begin{bmatrix}
\dfrac{1}{s} & \dfrac{1}{s^2} & \dfrac{1}{s^2}\, c_t^T (Is - F_t)^{-1} & -\dfrac{1}{s^2}\, c_m^T (Is - F_m)^{-1} \\
0 & \dfrac{1}{s} & \dfrac{1}{s}\, c_t^T (Is - F_t)^{-1} & -\dfrac{1}{s}\, c_m^T (Is - F_m)^{-1} \\
0 & 0 & (Is - F_t)^{-1} & [0] \\
0 & 0 & [0] & (Is - F_m)^{-1}
\end{bmatrix}$$  (2.2-13)

and $L^{-1}[\;]$ denotes the inverse Laplace transform. The vectors
$\varphi_i$ and $\varphi_{i+1}$ are determined by evaluating the first row of
Eqs. (2.2-12) and (2.2-13) and by substituting respectively the quantities
$(t_f - t_i)$ and $(t_f - t_{i+1})$ for t into Eq. (2.2-12). Carrying out the
inversion operation indicated in Eq. (2.2-12) is straightforward and leads to
fairly simple expressions for the elements of $\varphi_i$ and $\varphi_{i+1}$
when the dimensions of $F_t$ and $F_m$ are not too large.
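As a worked illustration of the inversion, suppose (purely as an assumption for this example) that both the target model and the autopilot are first-order lags, $F_t = -a_t$ and $F_m = -a_m$ with scalar $c_t$ and $c_m$. The off-diagonal first-row entries of Eq. (2.2-13) then reduce to $c/(s^2(s+a))$, whose inverse transform is $c\,(at - 1 + e^{-at})/a^2$. The sketch below evaluates the resulting first row of $\Phi(t, 0)$.

```python
import math


def ramp_lag(t, a):
    """Inverse Laplace transform of 1 / (s**2 * (s + a)) evaluated at t.

    Valid for a != 0. For very small a*t the expression loses precision
    through cancellation; a series expansion would be preferable there.
    """
    return (a * t - 1.0 + math.exp(-a * t)) / a ** 2


def first_row(t, a_t, c_t, a_m, c_m):
    """First row of Phi(t, 0) for the state [y, y_dot, x_t, x_m].

    Assumes scalar first-order target and autopilot models,
    F_t = -a_t and F_m = -a_m (illustrative choice only).
    """
    return [1.0, t, c_t * ramp_lag(t, a_t), -c_m * ramp_lag(t, a_m)]
```

Evaluating `first_row` at $t = t_f - t_i$ and $t = t_f - t_{i+1}$ gives $\varphi_i^T$ and $\varphi_{i+1}^T$ for this assumed model, so the feedback gains depend on time only through time-to-go.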
The elements of $\varphi_i$ and $\varphi_{i+1}$ vary with the index i,
requiring knowledge of time-to-go which is given by

$$t_f - t_i = \frac{\text{range at } t_i}{v_c}$$

assuming that the closing velocity, $v_c$, is constant. In practice, $v_c$ is
not exactly constant so that $t_f - t_i$ must be continually estimated from
measurements of range and range rate. Consequently, in the form presented
here, the optimal stochastic guidance law is applicable only for those
missiles having a radar homing sensor, or some other method of measuring
range.

If various simplifications are made, such as modeling the noise as being
independent of range, using constant filter gains, neglecting autopilot
dynamics, etc., the requirement for range measurements can be eliminated;
however it is expected that system performance will be somewhat degraded.
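A minimal sketch of the time-to-go estimate from range and range-rate measurements follows. It assumes the closing velocity is taken as the negative of the measured range rate; the function name and the rejection of a non-closing geometry are choices made here, not part of the report.

```python
def time_to_go(range_ft, range_rate_fps):
    """Estimate t_f - t_i as range divided by closing velocity.

    range_rate_fps is negative while the missile closes on the target, so
    the closing velocity is taken as -range_rate_fps (assumption).
    """
    closing_velocity = -range_rate_fps
    if closing_velocity <= 0.0:
        raise ValueError("range is not decreasing; time-to-go is undefined")
    return range_ft / closing_velocity
```

In a noisy system one would typically smooth the range and range-rate measurements before forming this ratio.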

The Kalman filter gains $K_i$ in Eq. (2.2-5) are calculated from a
time-varying nonlinear difference equation. Generally it is most practical
to compute these gains on-line because $H_i$ depends upon both the closing
velocity (see Eq. (2.1-12)) and time-to-go, which are not known before the
mission.
2.3  SUBOPTIMAL  NONLINEAR  STOCHASTIC  GUIDANCE  LAW

The optimal stochastic guidance problem associated with Eq.
(2.1-19) is summarized as follows:

Given the linear discrete-time dynamic relations

$$x_{i+1} = \Phi x_i + \gamma u_i + w_i$$  (2.3-1)

with linear measurements

$$z_i = H_i x_i + v_i$$  (2.3-2)

determine the optimal sequence of controls $\{u_i^o\}$
(i = 0, 1, ..., N-1) which minimizes the performance index

$$J_2 = E\left\{ x_1(t_f)^2 + r \sum_{i=0}^{N-1} u_i^2 \right\}$$  (2.3-3)

where r is a weighting constant selected by the designer. Definitions of the
quantities $\Phi$, $\gamma$, $w_i$, and $v_i$ are available in Eqs. (2.1-12),
(2.1-14) and (2.1-15).

The solution to the above problem can be taken directly from
Section A.2; however, first it is convenient to modify Eq. (2.3-1) using
the transformation technique described in Section A.4. Specifically, we
define

$$y_i = \Phi^{N-i} x_i$$
$$y_{1,i} = \varphi_i^T x_i$$  (2.3-4)
where $y_i$ is the terminal state produced by an initial state $x_i$ with no
control or random forcing function applied, $y_{1,i}$ is the first element of
the vector $y_i$, and $\varphi_i^T$ is the first row of the matrix
$\Phi^{N-i}$. Substitution from Eq. (2.3-1) for $x_{i+1}$ and $x_i$ produces

$$y_{i+1} = y_i + \delta_i u_i + \tilde{w}_i$$  (2.3-5)

where $\delta_i$ and the transformed noise $\tilde{w}_i$ are specified by Eqs.
(A.4-6) and (A.4-7). Now the first element of the vector $y_i$ is the terminal
miss distance produced by the state at time $t_i$. Hence, using Eq. (2.3-4)
we have

$$y_{1,N} = x_1(t_f)$$  (2.3-7)

Therefore by substitution from Eqs. (2.3-4) through (2.3-7) into Eqs. (2.3-1)
through (2.3-3), the linear optimal stochastic guidance problem can be
restated as follows:

Given the linear discrete-time dynamic relations

$$x_{i+1} = \Phi x_i + \gamma u_i + w_i$$
$$y_{1,i+1} = y_{1,i} + \delta_{1,i}\, u_i + \tilde{w}_{1,i}$$  (2.3-8)

with linear measurements

$$z_i = H_i x_i + v_i$$  (2.3-9)

determine the optimal sequence of controls $\{u_i^o\}$
(i = 0, 1, ..., N-1) which minimizes the performance index

$$J_2 = E\left\{ y_{1,N}^2 + r \sum_{i=0}^{N-1} u_i^2 \right\}$$  (2.3-10)


Although the above problem statement is apparently more complex
than Eqs. (2.3-1) through (2.3-3), it permits the solution for the
optimal controls to be more readily obtained. The latter follows directly
from Section A.2. First a conventional Kalman filter is implemented to
obtain an estimate of $y_{1_i}$. This is done by first estimating $x_i$ using
Eq. (2.2-5) and then applying the transformation

    $$ \hat{y}_{1_i} = \varphi_i^T \hat{x}_i $$                (2.3-11)

This part of the solution is almost identical to that for the nonlinear
problem discussed in the preceding section; the only exception is that
$u_i^o$ is now computed differently, as indicated below.

We can derive $\{u_i^o\}$ with the aid of the scalar equation for
$y_{1_{i+1}}$ (Eq. (2.3-8)) and the performance index in Eq. (2.3-10).
Comparing these relations with Eqs. (A.2-1), (A.2-2), and (A.2-3) and making
the identifications


=  1 


=  0 


i  t  N 


=  r 


x.  (in  Eq.(A.2-2)) 


:  :j 


for  all  i 


(2.3-12) 


V 
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we  obtain  the  optimal  control  law 

    $$ u_i^o = -c_i \hat{y}_{1_i} $$                           (2.3-13)

where the scalar feedback gain $c_i$ is computed from the backward recursion
relations

    $$ c_i = \frac{s_{i+1}\,\delta_{1_i}}{s_{i+1}\,\delta_{1_i}^2 + r} $$
    $$ s_i = s_{i+1} - c_i^2 \left( s_{i+1}\,\delta_{1_i}^2 + r \right) $$
    $$ s_N = 1 $$                                              (2.3-14)

The control law in Eq. (2.3-13) is similar to "predictive proportional
guidance" (Ref. 5) in the sense that $u_i^o$ depends upon the predicted
terminal miss distance, $\hat{y}_{1_i}$. The solution given here is somewhat
more general because missile autopilot dynamics and target dynamics are
included in the problem formulation.
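
The backward recursion for the scalar gain can be sketched in a few lines.
This is an illustrative sketch, not the report's program: the function name
feedback_gains and the list delta1 (holding the scalars delta_{1_i} for
i = 0, ..., N-1, which in the report come from Eq. (A.4-6)) are assumptions
made here, and r is assumed to be positive.

```python
def feedback_gains(delta1, r):
    """Backward recursion of Eq. (2.3-14) for the scalar gains c_i.

    delta1[i] is the scalar delta_{1_i} for i = 0..N-1 (assumed given);
    r is the positive control weighting constant.
    Returns (c, s) with c[i] for i = 0..N-1 and s[i] for i = 0..N.
    """
    n = len(delta1)
    s = [0.0] * (n + 1)
    c = [0.0] * n
    s[n] = 1.0                                   # terminal condition s_N = 1
    for i in range(n - 1, -1, -1):               # sweep backward from t_N
        denom = s[i + 1] * delta1[i] ** 2 + r
        c[i] = s[i + 1] * delta1[i] / denom
        s[i] = s[i + 1] - c[i] ** 2 * denom
    return c, s
```

Because the sweep starts from the terminal condition, the whole gain history
has to be available before the first control is applied, which is why the
gains would be computed ahead of time and stored.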

To provide an analogy with the results obtained in the preceding
section, we combine Eqs. (2.3-11) and (2.3-13) to obtain

    $$ u_i^o = -\underline{c}_i^T \hat{x}_i , \qquad \underline{c}_i \triangleq c_i \varphi_i $$    (2.3-15)

The gains $\underline{c}_i$ are distinguished from the gains
$\underline{d}_i$ in Eq. (2.2-10) by the comparison between the scalar
quantities $(1/\delta_{1_i})$ and $c_i$. The latter is the more difficult
quantity to evaluate because no closed form solution is available for
Eq. (2.3-14), whereas $\delta_{1_i}$ is obtained analytically. Because the
boundary condition on $s_i$ is specified at the terminal time $t_N$,
Eq. (2.3-14) is solved backward in time, and the feedback gain must be stored
in the guidance computer.* This computational distinction is probably not
important in applications where the dynamics of the guidance problem are
known a priori, because the gains for each guidance law can be calculated
off-line and approximated in storage as polynomial functions of time-to-go.
However, if some important dynamic parameters, such as those associated
with the missile airframe, are unknown and must be identified on-line,
then the system gains must be calculated on-line. In the latter situation,
the computational advantage of the optimal nonlinear law is more significant.
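
One way to read the comparison between the two scalar quantities is to
rearrange the first relation in Eq. (2.3-14). The rearrangement below is a
sketch added for illustration; it assumes $\delta_{1_i} \neq 0$ and $r > 0$.

```latex
c_i\,\delta_{1_i}
  = \frac{s_{i+1}\,\delta_{1_i}^2}{s_{i+1}\,\delta_{1_i}^2 + r} < 1
\quad\Longrightarrow\quad
c_i = \frac{1}{\delta_{1_i}} \cdot
      \frac{s_{i+1}\,\delta_{1_i}^2}{s_{i+1}\,\delta_{1_i}^2 + r}
```

Under these assumptions the scalar gain is $1/\delta_{1_i}$ multiplied by a
factor between zero and one, and the factor approaches one only when r is
small compared with $s_{i+1}\,\delta_{1_i}^2$.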

Thus far, the guidance law derived above can be represented as
a linear filter cascaded with a linear control policy. In mechanizing the
guidance equations, the control is first computed according to Eq. (2.3-15).
If $|u_i^o| \le D$, the linear control is applied; however, if $|u_i^o| > D$,
the control level is "clipped" at the level $D\,\mathrm{sgn}(u_i^o)$ by the
saturation inherent in the control actuator. Consequently, the actual applied
control will in general be nonlinear; it is also suboptimal with respect to
the objective of minimizing $J_2$. In order to distinguish the applied control
surface deflection from that given in Eq. (2.2-10), we designate the entire
sequence of controls generated by the procedure described above as
$\{u_{i_o}\}$,

    $$ u_{i_o} = \begin{cases}
         -\underline{c}_i^T \hat{x}_i , & |\underline{c}_i^T \hat{x}_i| \le D \\
         -D\,\mathrm{sgn}(\underline{c}_i^T \hat{x}_i) , & |\underline{c}_i^T \hat{x}_i| > D
       \end{cases} $$                                          (2.3-16)

* Perhaps an analytical solution can be obtained for the discrete-time
feedback gain by making an analogy with the continuous-time case
treated in Ref. 5. No attempt has been made here to resolve this
question.

where the subscript "o" denotes that it is a suboptimal nonlinear guidance
law. The adjective "suboptimal" applies in two contexts: because of the
nonlinearity in Eq. (2.3-16), $\{u_{i_o}\}$ does not generally achieve as low
a value of $J_2$ as the unconstrained control; it also does not generally
achieve as low a value of $J_1$ in Eq. (2.2-3) as the control law given in
Eq. (2.2-10).

A block diagram of the above guidance law is given in Fig. 2.3-1.
It is observed by comparison with Fig. 2.2-1 that the functional structure
of the suboptimal law is exactly the same as the optimal nonlinear law
derived in the previous section. The difference in specific detail between
the two is, as we have already noted, the manner in which the gains
operating upon $\hat{x}_i$ are computed. An evaluation of the intercept
accuracy obtained using the controls defined in Eqs. (2.2-10) and (2.3-16) is
given in the next chapter.
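
The clipping step in Eq. (2.3-16) can be sketched as follows. The function
name suboptimal_control and its argument names are assumptions introduced for
illustration; the gain vector and the state estimate are taken as given.

```python
import math

def suboptimal_control(c_vec, x_hat, d_max):
    """Linear command -c^T x_hat, clipped to the interval [-d_max, d_max].

    c_vec: feedback gain vector (assumed given, e.g. from Eq. (2.3-15))
    x_hat: current state estimate from the filter
    d_max: maximum control surface deflection D
    """
    linear = -sum(c * x for c, x in zip(c_vec, x_hat))
    if abs(linear) <= d_max:
        return linear                      # unsaturated: apply linear law
    return math.copysign(d_max, linear)    # saturated: keep the sign only
```

For example, with d_max = 0.2 a linear command of -3.0 is applied as -0.2,
while a command of -0.09 is passed through unchanged.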


Figure 2.3-1  Suboptimal Nonlinear Stochastic Guidance Law



THE  ANALYTIC  SCIENCES  CORPORATION 


3.  EVALUATION  OF  GUIDANCE  LAWS 


In  this  chapter  the  results  of  digital  computer  simulations  of  the 
optimal  and  suboptimal  nonlinear  guidance  laws  derived  in  Chapter  2  are 
presented.  In  addition,  a  method  of  limiting  missile  airframe  lateral 
acceleration is proposed and evaluated. Statistical averages
(root-mean-square values) of important quantities -- terminal miss distance,
peak acceleration, etc. -- are computed from the results of twenty-five
Monte Carlo runs performed for each of several different launch times and
different values of the guidance problem parameters (measurement noise
level, target acceleration level, etc.). These averages are determined
empirically, rather than analytically, because the equations describing
their evolution along the missile's trajectory are too complex* to solve for
the number of different cases which we wish to examine.


3.1  CHOICE  OF  MATHEMATICAL  MODELS 

Missile  Dynamics  —  For  this  investigation  the  missile  airframe 
dynamics are those of a vehicle that utilizes aerodynamic lift for its
maneuvering force and has tail-mounted control surfaces and fixed wings. The
missile is assumed to be in coasting (nonthrusting) flight with its equations
of motion in the form of Eq. (2.1-2). For this type of missile, the
airframe dynamic parameters are specified by


* This is a consequence of the fact that the guidance laws are nonlinear.

and  the  airframe  state  variables  are 


    $$ x_m(t) = \begin{bmatrix} q(t) \\ a(t)' \end{bmatrix} $$    (3.1-2)

The input u(t) is the control surface deflection angle. The symbols used
in the above expressions are defined as follows:

    $M_\alpha$, $M_q$, $M_\delta$, $L_\alpha$, $L_\delta$  =  Stability derivatives
    V      =  Airspeed
    q(t)   =  Pitch rate
    a(t)'  =  Normal acceleration produced by body-wing lift

We assume that all of the above parameters are constant and known and
that q(t) and a(t)' can be measured from rate gyro and accelerometer
outputs. This second order model describes the dominant planar rotational
motion of the airframe.


If  an  accelerometer  is  oriented  along  the  lift  vector  and  mounted  at  the 
missile  center  of  gravity,  its  output,  a(t),  is  related  to  a(t)'  by  the 
relation  (neglecting  measurement  noise): 


    $$ a(t)' = a(t) - V L_\delta\, u(t) $$


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Target  Dynamics  —  The  target  motion  has  the  random  structure 
specified by Eq. (2.1-4). In this simulation the target acceleration
$a_{t_y}$ is assumed to be a first-order Markov process specified by the
scalar quantities

    $$ F_t = \beta , \qquad C_t = 1 , \qquad Q_t = -2\beta\sigma^2 $$    (3.1-3)

That is, $a_{t_y}$ satisfies the differential equation

    $$ \dot{a}_{t_y}(t) = \beta\, a_{t_y}(t) + w_t(t) $$
    $$ E\{ w_t(t)\, w_t(\tau) \} = -2\beta\sigma^2\, \delta(t - \tau) $$

The covariance $Q_t$ of the white noise process which drives the target
dynamics is expressed in terms of $\sigma$, the steady state root-mean-square
(rms) target acceleration; i.e.,

    $$ \lim_{t \to \infty} E\{ a_{t_y}(t)^2 \} = \sigma^2 $$
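
A sampled version of this target model can be sketched as below. The sketch
is an illustration under stated assumptions rather than the report's
simulation code: it uses the standard exact discretization of a stable
first-order Markov process (the report's own discrete noise covariance comes
from Eq. (2.1-15), which is not reproduced here), the function names are
hypothetical, and the numerical values are the target and sampling
parameters quoted in Section 3.2.

```python
import math
import random

BETA = -0.3      # 1/sec, target parameter quoted in Section 3.2
SIGMA2 = 9.0e3   # (ft/sec^2)^2, steady-state mean square target acceleration
DT = 0.05        # sec, measurement interval quoted in Section 3.2

def discrete_target_model(beta=BETA, sigma2=SIGMA2, dt=DT):
    """Return (phi, q_d) for a[k+1] = phi * a[k] + w[k], var(w[k]) = q_d.

    Assumed exact discretization: phi = exp(beta * dt) and
    q_d = sigma2 * (1 - phi^2), which keeps the variance at sigma2.
    """
    phi = math.exp(beta * dt)
    q_d = sigma2 * (1.0 - phi * phi)
    return phi, q_d

def simulate_target_acceleration(n_steps, rng, beta=BETA, sigma2=SIGMA2, dt=DT):
    """Generate one sample history of target acceleration (ft/sec^2)."""
    phi, q_d = discrete_target_model(beta, sigma2, dt)
    a = rng.gauss(0.0, math.sqrt(sigma2))    # start in the steady state
    history = [a]
    for _ in range(n_steps):
        a = phi * a + rng.gauss(0.0, math.sqrt(q_d))
        history.append(a)
    return history
```

With BETA = -0.3 per second the correlation time constant is 1/0.3, or about
3.3 seconds.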

Measurement  Noise  —  The  measurements  available  for  imple¬ 
menting  the  guidance  laws  are  described  by  Eqs.  (2. 1-11)  and  (2.1-12). 

In  this  investigation  it  is  assumed  that  the  missile  autopilot  sensors  directly 
observe  both  state  variables,  i.e., 

H  =  I  (3.1-4) 

at uniform intervals of length $\Delta t$. The most important element of the
measurement noise covariance matrix is $r_{11}$, the mean square value of
the homing sensor noise. In a practical application, homing sensor noise
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is  contributed  by  the  sensor  receiver  unit,  the  target  itself  (scintillation 
noise),  the  target  environment,  and  the  servo  control  loop  used  to  direct 
the  sensor.  Some  noise  components  decrease  as  range  decreases  (e.  g., 
radar  receiver  noise);  others  increase  as  range  decreases  (e.g.,  target 
scintillation  noise);  others  are  range  independent  (e.  g. ,  sensor  servo 
noise).  For  the  purpose  of  providing  a  comparative  evaluation  of  the 
guidance laws derived in Chapter 2, we choose $r_{11}$ to be constant along a
given trajectory with a value that is inversely proportional to the square
of the launch range, $r_0$. This simulates, in part, the effect of target
scintillation noise, which is the most troublesome error source. The
validity of this noise model improves as the launch range decreases. The
expression from which $r_{11}$ is calculated is

    $$ r_{11} = \left[ \frac{2\sigma_s}{r_0\,\Delta t} \right]^2 $$    (3.1-5)

The quantity $\sigma_s/(r_0\,\Delta t)$ represents the standard deviation of
the scintillation measurement error in line-of-sight rate at the instant of
launch. This error is caused by the fact that radar reflections are returned
from different points on the target from sample to sample because of the
target's rotation relative to the missile and/or because of changes in radar
transmitter frequency. The rms value of the separation between reflecting
points is denoted by $\sigma_s$. The factor of two is inserted into
Eq. (3.1-5) simply to allow for the fact that scintillation noise strength
increases as the range to the target decreases. Thus $r_{11}$ represents an
"average" scintillation noise along the missile trajectory. This model
provides a realistic sensor noise level, neglecting time variation in the
noise statistics.

Some qualification is needed for the assumption that the homing
sensor noise samples in Eq. (2.1-11) are independent. This is not
realistic if the homing sensor is a radar that operates at constant frequency
because  then  the  scintillation  effect  is  due  solely  to  changes  in  the  relative 
rotational  orientation  of  the  missile  and  target,  which  usually  occur  more 
slowly  than  the  pulse  repetition  rate.  However,  it  is  often  found  desirable 
to  use  "frequency  diversity"  —  i.e.,  to  change  the  transmitter  frequency 
from  pulse  to  pulse  --  to  frustrate  jamming  countermeasures  taken  by  the 
target.  In  this  case,  successive  scintillation  noise  samples  tend  to  be  inde¬ 
pendent.  The  latter  situation  has  the  more  adverse  effect  upon  guidance 
accuracy.  If  the  error  in  line-of-sight  angle  has  significant  correlation 
over  some  number  of  adjacent  pulses,  the  resulting  error  in  measuring 
LOS  rate  is  less  than  if  the  measurement  errors  are  uncorrelated. 
Therefore,  the  model  used  here  represents  the  worst  type  of  scintillation 
noise. 


Other sources of measurement noise are the autopilot sensors
whose mean square levels are denoted by $r_{22}$ (gyro noise) and $r_{33}$
(accelerometer noise). These two parameters are also assumed to be constant.
All three measurement errors are assumed to be uncorrelated with each
other so that the off-diagonal terms in $R_i$ are zero. Therefore $R_i$ is a
constant matrix, R, of the form

    $$ R_i = R = \begin{bmatrix} r_{11} & 0 & 0 \\ 0 & r_{22} & 0 \\ 0 & 0 & r_{33} \end{bmatrix} ,
       \qquad i = 0, 1, \ldots, N-1 $$                         (3.1-6)
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3.2  SIMULATION  RESULTS 

In this section, both the optimal and suboptimal nonlinear control
sequences derived in Chapter 2, $\{u_i^o\}$ and $\{u_{i_o}\}$ respectively,
are evaluated from computer simulation results. The values of the parameters
defined in Section 3.1 are given below:


Missile  Airframe  Parameters 


M  = 

-  0.455 

L  =  10.15 

q 

a 

it 

s° 

-  8.4 

L6  =  1.86 

II 

<o 

2 

-71.2 

V  =  2920  ft/sec 

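Target Parameters

    $\beta = -0.3\ \mathrm{sec}^{-1}$
    $\sigma^2 = 9.0 \times 10^{3}\ (\mathrm{ft/sec^2})^2$


Measurement Parameters

    $\Delta t = 0.05$ sec        $v_c = 2000$ ft/sec
    $\sigma_s = 4.75$ ft         $r_{22} = 5.0 \times 10^{-6}\ (\mathrm{rad/sec})^2$
    $r_{33} = 10.0\ (\mathrm{ft/sec^2})^2$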
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Initial  State  Statistics 


ix  =  0 


P0  = 


0  0 
0  0 


0 

0 


0  0  9.0xl03(ft/sec2)2 
0  0  0 
0  0  0 


0 

0 

0 

1.0  x  l(f  *(rad/sec)3 


0 

0 

0 

0 


1.0  x  103(ft/sec2)j 


The missile airframe parameter values given above are taken
from Ref. 6, Appendix H. The target dynamics are chosen to yield a target
acceleration correlation time constant of about three seconds. The
two upper-left diagonal elements of the initial state covariance matrix,
$P_0$, are taken to be zero, simulating the absence of an initial heading
error.
This  is  done  so  that  the  effects  of  target  acceleration  alone  on  terminal 
miss  distance  can  be  analyzed.  Of  course,  appreciable  heading  errors 
can  exist  at  launch  —  especially  at  close  ranges  —  and  their  presence 
should  be  included  in  a  complete  quantitative  evaluation  of  these  guidance 
laws. 


Another parameter to be selected is the weighting constant r that
is associated with $J_2$ in Eq. (2.3-3) and which is needed to compute the
suboptimal control sequence $\{u_{i_o}\}$, specified in Eq. (2.3-16). The
value chosen for r should be such that the comparison between $\{u_i^o\}$ and
$\{u_{i_o}\}$ is a fair one. For example, if r is large, the suboptimal law
heavily penalizes the control level. This tends to yield small feedback
gains, $\underline{c}_i$, in Eq. (2.3-16) and correspondingly small values of
$|u_{i_o}|$, at the expense of a relatively large terminal miss distance.
Thus, if terminal miss distance
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is  used  as  the  basis  of  comparison,  the  optimal  guidance  law  will  be 
definitely  superior;  if  the  guidance  laws  are  evaluated  on  the  basis  of  con¬ 
trol  level,  then  the  suboptimal  law  will  appear  to  be  superior.  To  avoid 
this  ambiguity,  we  regard  the  level  of  terminal  miss  distance  as  the  pri¬ 
mary  indicator  of  system  performance;  after  all,  miss  distance  is  the  sole 
quantity  appearing  in  the  performance  index  for  the  optimal  nonlinear  guid¬ 
ance  law  (Eq.  (2.2-3)).  We  shall  also  be  interested  in  the  control  levels 
required  by  both  guidance  laws,  but  this  consideration  will  be  of  secondary 
importance.  With  these  priorities  in  mind,  r  is  chosen  small  enough  so  that 
any  further  reduction  in  its  value  produces  no  significant  further  reduction 
in  the  expected  terminal  miss  distance;  the  value  selected  was  ten. 

The simulation consisted of substituting $\{u_i^o\}$ and $\{u_{i_o}\}$ from
Eqs. (2.2-10) and (2.3-16) for $\{u_i\}$ in Eq. (2.2-1), beginning at a
variety of launch ranges. Twenty-five Monte Carlo computer runs were made
from each launch point; the random sequences $\{w_i\}$ and $\{v_i\}$ in
Eqs. (2.2-1) and (2.2-2) were generated by a Gaussian random number
generator. Because we are interested in the relative performance of
$\{u_i^o\}$ and $\{u_{i_o}\}$, identical sets of random numbers are used in
the simulation of each guidance law.
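
The use of identical random numbers for each law can be sketched as a small
paired Monte Carlo harness. Everything named here is hypothetical and
introduced only for illustration: paired_monte_carlo, toy_engagement and the
seed scheme are assumptions, and toy_engagement is a scalar stand-in for the
engagement model, not the model of Eqs. (2.2-1) and (2.2-2).

```python
import math
import random

def rms(values):
    """Root-mean-square value of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def paired_monte_carlo(run_engagement, laws, n_runs=25, base_seed=0):
    """Run every guidance law against the same noise sequences, run by run.

    run_engagement(law, rng) must return the terminal miss distance.
    laws maps a label to a guidance-law callable.
    """
    results = {name: [] for name in laws}
    for k in range(n_runs):
        for name, law in laws.items():
            rng = random.Random(base_seed + k)   # same seed for every law
            results[name].append(run_engagement(law, rng))
    return {name: rms(misses) for name, misses in results.items()}

def toy_engagement(law, rng, n_steps=40):
    """Hypothetical scalar engagement: returns the final miss distance."""
    y = 0.0
    for i in range(n_steps):
        z = y + rng.gauss(0.0, 1.0)              # noisy measurement
        y = y + law(z, n_steps - i) + rng.gauss(0.0, 0.5)
    return y
```

Reseeding the generator with the same value before each law is run means
that any difference in the rms miss distance is caused by the guidance laws
and not by the particular noise samples drawn.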


Figure 3.2-1 shows the performance of both guidance laws with
the maximum control surface deflection, D in Eq. (2.2-4), set equal to
0.2 radian. In Fig. 3.2-1(a) the rms values of the terminal miss distance
obtained using the optimal and suboptimal guidance laws are plotted for
launch times ranging from one to six seconds before intercept. Note that
the optimal law gives an accuracy only slightly superior to that of the
suboptimal law.

Figure 3.2-1  Guidance Law Performance Averaged Over Twenty-Five Monte
Carlo Runs: D = 0.2 rad. Panels: (a) Miss distance; (b) Acceleration;
(c) Average control effort. Horizontal axis: range-to-go at launch (ft).



It  has  been  stated  that  terminal  miss  distance  is  of  primary  im¬ 
portance  in  evaluating  the  guidance  laws.  However,  because  the  difference 
between  the  miss  distance  achieved  with  each  law  is  so  small,  other  charac¬ 
teristics  can  be  used  as  a  basis  of  comparison.  Figure  3. 2-l(b)  shows  the 
rms peak airframe lateral acceleration in g's (1 g = 32.2 ft/sec^2) encountered
along  each  set  of  twenty-five  trajectories.  This  peak  acceleration  usually 
occurs  at,  or  just  before,  the  terminal  time,  when  the  line-of-sight  rate 
becomes  large  because  of  proximity  to  the  target.  Again  there  is  little  dif¬ 
ference  between  the  behavior  of  the  two  guidance  laws.  However,  it  is 
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important  to  note  that  large  lateral  accelerations  —  in  excess  of  30  g's  — 
are  developed;  recall  that  there  is  nothing  in  the  guidance  problem  formu¬ 
lation  presented  in  Chapter  2  which  directly  limits  acceleration.  If  homing 
sensor  measurement  noise  or  target  acceleration  are  significantly  larger 
than  the  values  used  for  this  simulation  (both  possibilities  are  realistic), 
lateral  accelerations  can  be  developed  along  the  trajectory  that  are  beyond 
the  aerodynamic  or  structural  capability  of  the  missile  airframe.*  Conse¬ 
quently  it  will  be  desirable  to  incorporate  some  method  of  bounding  accel¬ 
eration  within  the  control  law;  this  is  the  subject  of  subsequent  sections. 


Another  useful  basis  for  comparing  the  two  guidance  laws  is  the 
amount of control used. Recall that the suboptimal law was derived by
finding the sequence of controls which minimizes $J_2$ in Eq. (2.3-3). This
performance index differs from that for the nonlinear law (Eq. (2.2-3)) in
that it contains a term

    $$ r \sum_{i=0}^{N-1} u_i^2 $$


which  is  a  measure  of  the  energy  expended  by  some  types  of  missile  actuators 
in  driving  the  control  surface.  To  reflect  this  fact  more  clearly  we  define  the 
control  effort  e  by 


    $$ e \triangleq \sum_{i=0}^{N-1} u_i^2 $$                  (3.2-1)


The average value of e, denoted by $\bar{e}$, evaluated over each set of
twenty-five Monte Carlo runs, is shown in Fig. 3.2-1(c). Evidently the
suboptimal law is much more efficient than the optimal law in terms of the
required level of $\bar{e}$. This might be somewhat surprising because the
suboptimal law was specified using a value of control weighting that can be
considered equal to zero for all practical purposes; that is, the form of
$J_2$ (with small r) is approximately the same as $J_1$ in Eq. (2.2-3).
Consequently one might expect that the suboptimal law would be nearly
identical to the optimal law; this turns out to be a false conjecture. The
reason is that the control sequence $\{u_{i_o}\}$ in Eq. (2.3-16) is really a
suboptimal mechanization of the linear law in Eq. (2.3-15). The latter
implicitly assumes that any value of $u_i$ can be realized at any time since
no explicit constraint is imposed upon the control level. For the
purpose of minimizing $J_2$, even as r approaches zero, it is most efficient
to utilize large levels of control only near the end of the trajectory. By
comparison, each time the optimal law computes a new value of $u_i^o$, it
tries to completely null the predicted terminal miss distance
$\hat{y}_{1_i}$ in Eq. (2.2-7). This tends to require larger control levels
than the suboptimal law, especially during the initial portion of the
trajectory. The differences between the two guidance laws are illustrated in
Figs. 3.2-2 and 3.2-3 where representative gain histories and the rms control
level, $|u_i|_{rms}$, are shown for trajectories initiated at six seconds
before intercept.

* The missile airframe aerodynamic capability can be exceeded if a lateral
acceleration requires an angle of attack that violates the linearity
assumptions made in writing the airframe equations of motion (see
Eq. (2.1-2)); the airframe structural capability is exceeded if the lateral
acceleration developed by the missile causes structural failure, e.g., if
the wings are torn off.

In Fig. 3.2-2, the third elements, $d_{i3}$ and $c_{i3}$, of
$\underline{d}_i$ and $\underline{c}_i$ respectively in Eqs. (2.2-10) and
(2.3-16) are plotted; the relative behavior of these two quantities is
characteristic of all the feedback gains. Observe that the optimal gain,
$d_{i3}$, is much larger than the suboptimal gain $c_{i3}$ near the beginning
of the trajectory. If the weighting constant r were reduced below the value
ten, the effect on $c_{i3}$ would be a noticeable increase for small values
of time-to-go but essentially no change during the earlier portion of the
trajectory. In the presence of control surface limiting, the latter behavior
has no appreciable effect on the terminal guidance accuracy provided by the
suboptimal law.

Figure 3.2-3 compares the rms control levels (averaged over
twenty-five Monte Carlo runs) for both control laws. These curves reflect
the fact that the feedback gains for the optimal law are substantially larger,
especially during the first part of the trajectory. It is also generally true
that the control command for the optimal law frequently changes sign so that
the airframe is subjected to an input having a "bang-bang" character. This
may be undesirable for applications where the missile airframe has serious
bending modes that can be excited by the control switching action.

Figure 3.2-2  Representative Feedback Gain Histories for Optimal
and Suboptimal Guidance Laws: Trajectories
Beginning Six Seconds Before Intercept
(horizontal axis: range-to-go, ft)

The fact that the control levels are generally larger during the
first part of the missile trajectory for the optimal guidance law than they
are for the suboptimal law provides a corresponding difference in the level
of airframe lateral acceleration. This is indicated in Fig. 3.2-4, where rms
acceleration histories beginning at six seconds before intercept are plotted
for both guidance laws. (This figure cannot be deduced from Fig. 3.2-1(b),
which shows only rms peak acceleration.) Again, there is evidence
of the fact that the optimal law works harder, earlier, to null the terminal
miss distance. Notice also that, as mentioned previously, the peak
acceleration levels for both laws occur at the end of the trajectory. One
consequence of these observations is that the missile velocity losses caused
by induced drag will be greater for the optimal law. This may be an
important consideration, especially in long range missions or for missiles
having a low lift/drag ratio.

Figure 3.2-3  RMS Control Level: Trajectories Beginning
Six Seconds Before Intercept (horizontal axis: range-to-go, ft)

Figure 3.2-4  RMS Acceleration Level: Trajectories
Beginning Six Seconds Before Intercept (horizontal axis: range-to-go, ft)


Another interesting point to be made here is the comparison
between the linear steering law given by Eq. (2.3-15) and the suboptimal
nonlinear law in Eq. (2.3-16). Suppose there actually were no constraint
on control surface deflection; then how well would the linear law perform?
This question can be most readily answered by evaluating $J_2^o$ in Eq.
(2.3-3), using the expression in Eq. (A.2-5). The latter becomes


N-l 

J2  =  VSWg  Si+1  (-i+l^d^i+1 +  6l.Ci-i  ^i-i)  (3,2'2) 


where the matrix $Q_d$ is obtained by substituting from Eqs. (3.1-3) and
(2.1-8) into Eq. (2.1-15), the matrices $P_i$ are obtained from the Kalman
filter equations (Eq. (2.2-5)), $\delta_{1_i}$ is given by Eq. (A.4-6), and
$c_i$ and $s_i$ are determined by Eq. (2.3-14). The value of $J_2^o$ is to be
compared with the empirically determined average value of $J_2$, denoted by
$\bar{J}_{2_o}$, obtained by using the nonlinear control sequence
$\{u_{i_o}\}$ given in Eq. (2.3-16);

    $$ \bar{J}_{2_o} = \overline{x_1(t_N)^2} + r \sum_{i=0}^{N-1} \overline{u_{i_o}^2} $$    (3.2-3)

where the overbars denote averages over twenty-five Monte Carlo runs.
The values of $J_2^o$ and $\bar{J}_{2_o}$ are shown in Fig. 3.2-5 for the
different launch times used in the simulations for Fig. 3.2-1. Evidently the
performance predicted by the linear theory ($J_2^o$) is much better than that
obtained when the control surface deflection constraint is imposed. This
emphasizes the fact that in order to obtain a realistic indication of actual
guidance accuracy, simulations must be performed with control level
constraints included; analyses based exclusively upon linear theory tend to
be quite inaccurate.

Figure 3.2-5  Values of the Quadratic Performance Index
With and Without Control Surface Limiting
(horizontal axis: time-to-go at launch, sec)

A  number  of  other  simulations  were  performed  with  different 
values  assigned  to  various  parameters  --  such  as  homing  sensor  noise 
level,  rms  target  acceleration,  maximum  control  surface  deflection,  etc. 
The  qualitative  behavior  of  the  data  obtained  is  similar  to  that  shown  in 
Figs. 3.2-1 through 3.2-5. The conclusions reached thus far are
summarized below for convenient reference:




•  The  terminal  accuracy  achieved  by  the  optimal 
nonlinear  guidance  law  is  not  substantially  better 
(usually  less  than  ten  percent  better)  than  that 
provided  by  the  suboptimal  technique. 

•  The  suboptimal  law  uses  much  less  control  effort  — 
about  one-tenth  as  much  as  the  optimal  law;  the 
latter  is  bang-bang  in  nature,  a  fact  that  may  be  im¬ 
portant  when  significant  bending  modes  are  present. 

•  The levels of rms peak airframe lateral acceleration
generated by each method are approximately
the same; however, situations can occur where
potentially unacceptable levels (thirty to ninety g's)
occur.

•  As pointed out in Chapter 2, the gains ($\underline{d}_i$)
associated with the optimal law are more easily derived
than those ($\underline{c}_i$) for the suboptimal law.

The  above  conclusions  do  not  establish  a  definite  preference  for 
either  guidance  policy.  A  decision  cannot  be  made  between  the  two  guid¬ 
ance  laws  on  the  basis  of  terminal  accuracy  because  the  optimal  law  is 
only slightly better in this respect. The optimal law is more easily
mechanized, but against this advantage one must weigh the advantage that the
suboptimal law requires lower control levels. However, one important
question must be resolved before a definitive judgement can be made about
either guidance technique. Namely, the simulations must account for the
fact that the maneuvering acceleration available is limited by physical
constraints. For the type of missile considered here (i.e., a missile using
aerodynamic lift to develop maneuvering force), some method of preventing
each guidance law from developing excessive airframe lateral acceleration
must be provided. Consequently, the remainder of this chapter is concerned
with methods for limiting acceleration. This investigation will lead to more
distinctive comparisons between guidance laws.




3.3  A PREDICTIVE ACCELERATION LIMITER

The most straightforward approach for limiting airframe lateral
acceleration is to include a constraint of the form

    $$ |a(t_i)| \le D_a $$

in the design criteria for the guidance problem, where $D_a$ is the maximum
permissible acceleration. This type of "state variable constraint" can
easily  be  included  within  the  framework  of  the  problem  formulated  in 
Section  A.  3.  However,  the  simplifications  described  in  Section  A. 4  which 
allow  the  optimal  controls  to  be  derived  analytically  cannot  be  applied. 
Consequently  the  control  law  must  be  computed  numerically,  a  task  that  is 
currently  impractical  to  accomplish.  Therefore  we  must  settle  for  some 
other  technique  to  constrain  acceleration. 

Another approach that could be taken is to artificially limit the
control surface deflection, $u_i$, at some value which physically tends to
prevent large accelerations from being generated. For example, guidance
law performance data are given in Fig. 3.3-1 with the same simulation
parameter values used in Section 3.2, except that the value of $D$ is reduced
from 0.2 to 0.1 radian. Comparing these results with Figs. 3.2-1(a) and
(b), we find a general increase in the level of rms terminal miss distance
and a corresponding decrease in the rms peak acceleration. However, this
method of effecting a reduction in acceleration tends to be excessively
conservative. Intuitively it seems desirable to restrict $|u_i|$ only when the
airframe acceleration approaches the danger level; when it is well within the
safe operating limits, the maximum available control surface deflection
should be allowed. This reasoning leads us to seek a more efficient method
for limiting acceleration; one such technique is presented in this section.
Suppose a value of control surface deflection, $u_i$, is computed at
time $t_i$ by either one of the procedures given in Eqs. (2.2-10) and (2.3-16).
We wish to determine whether to apply $u_i$ to the airframe or to take some
other action that will limit the buildup in lateral acceleration. At time $t_i$
an estimate, $\hat{x}_i$, of the state of the guidance system is also available from
the Kalman filter, based on previous measurements and applied controls.
In order to arrive at an appropriate decision about implementing $u_i$, the
predicted acceleration level at some specified future time, $t_i + \tau_p$, caused
by $\hat{x}_i$ and the control $u_i$ is computed and compared against the desired
bound, $D_a$. If the predicted acceleration is too large, $u_i$ should be altered
to prevent the bound from being exceeded. A procedure for carrying out
this policy is described below.

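For a prediction interval of variable length $\tau$, the predicted state,
$\hat{x}(t_i+\tau)$, satisfies a differential equation similar to Eq. (2.1-7):

$$ \frac{d}{d\tau}\,\hat{x}(t_i+\tau) = F\,\hat{x}(t_i+\tau) + g\,u(t_i+\tau) $$

$$ u(t_i+\tau) = \begin{cases} u_i\,; & 0 \le \tau < \Delta t \\ 0\,; & \Delta t \le \tau \end{cases} \tag{3.3-1} $$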


where $u(t)$ is assumed to be zero after the application of $u_i$. Therefore
$\hat{x}(t_i+\tau)$ is given by

$$ \hat{x}(t_i+\tau) = e^{F\tau}\,\hat{x}(t_i) + \int_{t_i}^{t_i+\tau} e^{F(t_i+\tau-\lambda)}\,g\,u(\lambda)\,d\lambda $$

where $e^{F\tau}$ is the matrix exponential series for $F\tau$; it is also the transition
matrix $\Phi(\tau,0)$ given analytically by Eqs. (2.2-12) and (2.2-13). Assuming
that $\tau > \Delta t$ and recognizing that $u(t_i+\tau) = 0$ when $\tau > \Delta t$ we have

$$ \hat{x}(t_i+\tau) = e^{F\tau}\,\hat{x}(t_i) + e^{F(\tau-\Delta t)} \int_{0}^{\Delta t} e^{F(\Delta t-\lambda)}\,g\,u_i\,d\lambda \tag{3.3-2} $$


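Comparing the last equality in Eq. (3.3-2) with Eq. (2.1-14) and setting
$\tau = \tau_p$, a specified interval, we see that

$$ \hat{x}(t_i+\tau_p) = e^{F\tau_p}\,\hat{x}(t_i) + e^{F(\tau_p-\Delta t)}\,\gamma\,u_i = \Phi(\tau_p,0)\,\hat{x}(t_i) + \Phi(\tau_p-\Delta t,0)\,\gamma\,u_i \tag{3.3-3} $$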

Now because missile acceleration due to lift is the $n$th state variable in
Eq. (2.1-6), let $\eta$ be a column vector formed from the bottom row of
$\Phi(\tau_p,0)$ and let $\eta_0$ be the last element of the vector $\Phi(\tau_p-\Delta t,0)\,\gamma$. Then
if $\tau_p > \Delta t$, it follows that the predicted airframe lateral acceleration,
$\hat{a}_m(t_i+\tau_p)$, is given by

$$ \hat{a}_m(t_i+\tau_p) = \eta^T \hat{x}_i + \eta_0\,u_i \tag{3.3-4} $$
Once the prediction interval $\tau_p$ is selected, $\eta$ and $\eta_0$ can be computed from
knowledge of both the transition matrix and the vector $\gamma$ and programmed
into the guidance controller as constant gains, or possibly as time-varying
gains if the parameters defining the transition matrix vary in a known manner.
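The gain computation just described can be sketched numerically. The following is only an illustrative sketch, not the report's implementation: it assumes that $\gamma$ in Eq. (2.1-14) is the zero-order-hold input vector $\int_0^{\Delta t} e^{Fs} g\,ds$, evaluates the matrix exponential with a truncated Taylor series (with scaling and squaring) rather than the analytic expressions of Eqs. (2.2-12) and (2.2-13), and uses hypothetical function names (`mat_exp`, `limiter_gains`).

```python
import math


def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[math.fsum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_exp(m, t, terms=20, squarings=10):
    """Approximate exp(m * t): Taylor series on a scaled matrix, then repeated squaring."""
    n = len(m)
    scale = t / (2 ** squarings)
    a = [[m[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, a)]
        result = [[r + s for r, s in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def limiter_gains(F, g, dt, tau_p):
    """Return (eta, eta0) of Eq. (3.3-4) for dynamics F, input vector g (assumed forms)."""
    n = len(F)
    # Augmented matrix [[F, g], [0, 0]]: the top-right block of exp(aug * dt)
    # equals the assumed gamma = integral_0^dt exp(F s) g ds.
    aug = [list(F[i]) + [g[i]] for i in range(n)] + [[0.0] * (n + 1)]
    gamma = [row[n] for row in mat_exp(aug, dt)[:n]]
    phi_p = mat_exp(F, tau_p)        # Phi(tau_p, 0)
    phi_d = mat_exp(F, tau_p - dt)   # Phi(tau_p - dt, 0)
    eta = list(phi_p[n - 1])         # bottom row of Phi(tau_p, 0)
    eta0 = math.fsum(phi_d[n - 1][k] * gamma[k] for k in range(n))
    return eta, eta0
```

For a one-state example (a first-order lag with `F = [[-2.0]]`, `g = [2.0]`), `limiter_gains(F, g, 0.05, 0.2)` reproduces the closed-form values $e^{-0.4}$ and $e^{-0.3}(1 - e^{-0.1})$, which gives some confidence in the construction before it is applied to a higher-order model.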

Having the predicted acceleration, we can determine whether
its magnitude exceeds the specified bound, $D_a$. If it does, the value
of $u_i$ should be modified to reduce $|\hat{a}_m(t_i+\tau_p)|$, taking care to ensure that
$|u_i|$ does not exceed the control surface deflection limit $D$. This sequence
of logical tests is accomplished with the aid of Eq. (3.3-4) as
follows:


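$$ u_i = \begin{cases} u_i^o\,; & \text{Optimal Law} \\ u_{i_o}\,; & \text{Suboptimal Law} \end{cases} $$

$$ u_i' = \begin{cases} u_i\,; & D_a - |\hat{a}_m(t_i+\tau_p)| \ge 0 \\[4pt] \left[ D_a\,\mathrm{sgn}[\hat{a}_m(t_i+\tau_p)] - \eta^T \hat{x}_i \right] / \eta_0 \\ \quad \text{or equivalently} \\ u_i - \left\{ \hat{a}_m(t_i+\tau_p) - D_a\,\mathrm{sgn}[\hat{a}_m(t_i+\tau_p)] \right\} / \eta_0\,; & D_a - |\hat{a}_m(t_i+\tau_p)| < 0 \end{cases} $$

$$ u_i'' = \begin{cases} u_i'\,; & D - |u_i'| \ge 0 \\ D\,\mathrm{sgn}(u_i')\,; & D - |u_i'| < 0 \end{cases} \tag{3.3-5} $$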


Again we note that if the missile airframe dynamics are identified
on-line, then the analytical expressions for $\eta$ and $\eta_0$ must be
stored and evaluated on-line as parameters are identified.
The quantity $u_i'$ is generated to reduce the level of predicted acceleration
if the latter is too large. The resulting level of $u_i'$ required to correct the
acceleration could exceed the control surface deflection limits; this
possibility is prevented by calculating $u_i''$, which is the new control surface
deflection command. The operations are illustrated in Fig. 3.3-2.
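The logical tests of the limiter can be written out as a short routine. This is a minimal sketch under the assumption that $\eta_0 \ne 0$ and that the predicted acceleration is the linear form $\eta^T \hat{x}_i + \eta_0 u_i$; the function and argument names are hypothetical and chosen only for illustration.

```python
def sign(value):
    """Return +1.0 for non-negative values and -1.0 otherwise."""
    return 1.0 if value >= 0.0 else -1.0


def predictive_limiter(u, x_hat, eta, eta0, accel_limit, control_limit):
    """Apply the two-stage test: acceleration limiting, then control clipping.

    u             -- command from either guidance law
    x_hat         -- state estimate (list of floats)
    eta, eta0     -- prediction gains (eta0 assumed non-zero)
    accel_limit   -- lateral acceleration bound D_a
    control_limit -- control surface deflection bound D
    """
    free_response = sum(e * x for e, x in zip(eta, x_hat))
    predicted_accel = free_response + eta0 * u

    # Stage 1: if the predicted acceleration exceeds the bound, choose the
    # command that places the prediction exactly on the bound.
    if abs(predicted_accel) <= accel_limit:
        u_prime = u
    else:
        u_prime = (accel_limit * sign(predicted_accel) - free_response) / eta0

    # Stage 2: the corrected command must still respect the deflection limit.
    if abs(u_prime) <= control_limit:
        return u_prime
    return control_limit * sign(u_prime)
```

With the illustrative values `eta = [1.0]`, `eta0 = 2.0`, `accel_limit = 20.0` and `control_limit = 0.2`, a command of 0.2 passes unchanged when `x_hat = [10.0]`, is reduced to about 0.05 when `x_hat = [19.9]`, and saturates at -0.2 when `x_hat = [30.0]`. The last case shows that the deflection limit can prevent the acceleration bound from being met, which is consistent with the remark that the procedure is not necessarily optimal.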


Figure 3.3-2  Mechanization of the Predictive Acceleration Limiter


It is emphasized that the method described above for limiting
airframe lateral acceleration is not necessarily an optimal procedure, although
it is a physically reasonable one. The choice of the prediction interval $\tau_p$
is somewhat subjective. If it is too small, the airframe acceleration may
overshoot the bound; if it is too large, limiting action may occur before it
is actually necessary. In these simulations, $\tau_p$ = 0.2 sec was found to
be satisfactory.
To implement the acceleration limiter represented by Eq. (3.3-5),
the quantities $D_a$, $\eta$ and $\eta_0$ are needed. The airframe structural
limits dictate the value of $D_a$; $\eta$ and $\eta_0$ are specified by the transition
matrix, $\exp(F\tau_p)$, according to Eqs. (3.3-3) and (3.3-4). The computation
required for the limiter is a minor part of the total needed to generate
the control commands.


3.4  SIMULATION  RESULTS  WITH  ACCELERATION  LIMITING 

In this section, results are presented from a number of different
Monte Carlo simulations of the guidance laws derived in Chapter 2, when
the predictive acceleration limiter defined in the preceding section is used.
Because the limiting procedure is not necessarily optimal, the control
sequences actually applied as a result of either guidance law are probably
suboptimal. They are derived by operating on $\{u_i^o\}$ and $\{u_{i_o}\}$ respectively,
from Eqs. (2.2-10) and (2.3-16), with the logical tests given in Eq. (3.3-5).
The resulting sequences are designated as $\{u_{i_o}^o\}$ and $\{u_{i_{oo}}\}$. This notation
is suggestive of the fact that the former is suboptimal only in its treatment
of acceleration limiting; the latter is suboptimal with respect to both
control level and acceleration level limiting. For the reader's convenience,
the four guidance laws we have derived are summarized below:

$\{u_i^o\}$:  Optimal Nonlinear Guidance Law; minimizes
terminal miss distance subject to bounded
control level. No explicit acceleration
constraint imposed.

$\{u_{i_o}\}$:  Suboptimal Nonlinear Guidance Law; derived
using "linear quadratic gaussian theory" and
imposing the control bound after the fact. No
explicit acceleration constraint imposed.

$\{u_{i_o}^o\}$:  Suboptimal Law; derived by operating on $\{u_i^o\}$
with an acceleration limiter.

$\{u_{i_{oo}}\}$:  Suboptimal Law; derived by operating on $\{u_{i_o}\}$
with an acceleration limiter.


Each  guidance  law  was  evaluated  for  the  five  cases  given  in 
Table  3.4-1.  Under  each  case,  twenty-five  Monte  Carlo  simulations  were 
performed  for  each  of  nine  cases  having  launch  times  of  one  through  six 
seconds  before  intercept,  the  same  launch  times  used  in  the  simulations 
described  in  preceding  sections.  A  number  of  parameters  were  held  fixed 
for  all  cases;  these  are: 


Control Weighting:                  10 ft^2/rad^2

RMS Measurement Noise:              5.0 x 10^-6 (rad/sec)^2
                                    10 (ft/sec^2)^2
                                    0 ;  $i \ne j$

Initial State Covariance Matrix:    $p_{11}$ = 0
                                    $p_{22}$ = 0
                                    $p_{44}$ = 1.0 x 10^[exponent illegible] (rad/sec)^2
                                    $p_{55}$ = 1.0 x 10^3 (ft/sec^2)^2
                                    $p_{ij}$ = 0 ;  $i \ne j$

Control Surface Deflection Limit:   $D$ = 0.2 radian

Lateral Acceleration Limit:         $D_a$ = 20 g's

Mean Value of Initial State:        0
TABLE 3.4-1

TRAJECTORY SIMULATION PARAMETER VALUES

Case 1: Standard
Case 2: Lower Missile Airspeed
Case 3: Smaller Target Time Constant
Case 4: Larger Homing Sensor Noise
Case 5: Higher rms Target Acceleration

Simulation                         Case Number
Parameter        1            2            3            4            5

[illegible]     -0.455       -0.31        -0.455       -0.455       -0.455
[illegible]     -8.4         -7.05        -8.4         -8.4         -8.4
$M_\delta$      -71.2        -47.0        -71.2        -71.2        -71.2
$L_\alpha$       10.15        7.27         10.15        10.15        10.15
$L_\delta$       1.86         2.15         1.86         1.86         1.86
$V$              2920         1920         2920         2920         2920
$\beta$         -0.3         -0.3         -0.1         -0.3         -0.3
$\sigma_s$       4.75         4.75         4.75         15.0         4.75
$p_{33}$         9.0 x 10^3   9.0 x 10^3   9.0 x 10^3   9.0 x 10^3   3.6 x 10^4
$V_c$            2000         1000         2000         2000         2000
$\sigma^2$       9.0 x 10^3   9.0 x 10^3   9.0 x 10^3   9.0 x 10^3   3.6 x 10^4
$\Delta t$       0.05         0.05         0.05         0.05         0.05


Case number 1 in Table 3.4-1 is referred to as the standard;
its parameter values are the same as those used in the simulation
described in Section 3.2 except that acceleration limiting has been added.
The other four cases are described relative to the standard (e.g., lower
missile airspeed, smaller target time constant, etc.).

The performance data for Case 1 are displayed in Fig. 3.4-1,
which is to be compared with Fig. 3.2-1. In general the miss distances
for both guidance laws are now larger because the missile's maneuvering
capability has been restricted. However, a more significant observation
is that the rms terminal miss distance (see Fig. 3.4-1(a)) achieved with
$\{u_{i_o}^o\}$ is significantly lower (as much as thirty percent lower) than
that obtained with $\{u_{i_{oo}}\}$. Thus when acceleration limiting is introduced
into the guidance laws derived in Chapter 2, the performance advantage of
the optimal law improves. The improvement is even greater when the
levels of measurement noise and/or target acceleration are increased, as
will be seen from the results of the other cases in Table 3.4-1. To
understand why this improvement in relative performance of the optimal
nonlinear law occurs, recall that when the acceleration limiter is absent each
new value of $u_i^o$ attempts to null the predicted terminal miss calculated at
the $i$th stage. On the other hand, the suboptimal nonlinear controls
tend to reduce the terminal miss distance more gradually. These different
control actions cause the lateral acceleration history for $\{u_i^o\}$ to
have a larger magnitude than that for $\{u_{i_o}\}$ until near the end of the
trajectory (see Fig. 3.2-4). For both guidance laws, the acceleration levels
become largest near the end of the trajectory, because large accelerations
are needed to null miss distance when there is little time remaining until
intercept. Now, when acceleration limiting is introduced, the sequence
$\{u_{i_o}^o\}$ has an advantage over $\{u_{i_{oo}}\}$. The former, because it is derived
from the optimal nonlinear law, makes an effort to null the terminal miss
early in the trajectory where less lateral acceleration is required than if
it waits until near the intercept point. Consequently, as the intercept point
is approached $\{u_{i_o}^o\}$ has, on the average, already significantly reduced the
terminal miss. By comparison, $\{u_{i_{oo}}\}$, because it is derived from the
suboptimal nonlinear law from Chapter 2, does relatively little controlling
early in flight; therefore as the intercept point is approached a relatively
large terminal miss remains to be nulled. Consequently, the acceleration
limiter, which applies limiting action primarily near the end of the
trajectory, tends to degrade the performance of $\{u_{i_o}^o\}$ less than that of
$\{u_{i_{oo}}\}$. These observations suggest that the control sequence $\{u_{i_o}^o\}$ may be
close to optimal in the presence of acceleration limiting. However, this
conjecture can be verified only by actually determining the control law that
minimizes the magnitude of the terminal miss subject to constraints on
both control and acceleration levels, as suggested at the beginning of
Section 3.3.

There is a property of the acceleration limiter that can degrade
the performance of $\{u_{i_o}^o\}$ relative to $\{u_{i_{oo}}\}$. If the homing sensor
measurement noise causes an inaccurate estimate of the terminal miss
distance several seconds before intercept, then the corresponding applied
control $u_{i_o}^o$ may correct the missile's trajectory in the wrong direction.
If this happens, the other control law is preferred because it tends to apply
less "wrong control" early in the trajectory. Fortunately, this effect
apparently does not occur sufficiently often to contribute significantly to
the rms performance data in Fig. 3.4-1(a); however, it can show up in
individual trajectories.

Figure 3.4-1(b), compared with Fig. 3.2-1(b), indicates that
the limiter is successful in reducing the rms peak acceleration below the
bound of twenty g's. As expected, the limiting action causes the absolute
values of miss distance to increase (compare Figs. 3.4-1(a) and 3.2-1(a)).
However, note that the predictive limiter generally achieves lower miss
distances than the technique of artificially reducing the bounds on control
surface deflection demonstrated in Fig. 3.3-1.

Figure 3.4-1  Performance Evaluation of Suboptimal Guidance Laws
Including Acceleration Limiting: Case 1. Panels: (a) Miss distance;
(b) Acceleration; (d) Average control variation. Abscissae: time-to-go
at launch (sec) and range-to-go at launch (ft).

In order to provide an additional indication of the amount of
control required for each guidance law, we define the control variation, $v$,

$$ v = |u_0| + \sum_{i=0}^{N-2} |u_{i+1} - u_i| \tag{3.4-1} $$

In some missiles $v$ is a more accurate measure of the actuator energy
expenditure than is the control effort $e$ in Eq. (3.2-1), particularly when
an electrohydraulic actuator mechanism is employed which consumes
power only when the control surface deflection is changing. In addition, in
some hydraulic systems a change in control level is achieved by movement
of fluid which is discharged overboard. In the latter case, $v$ represents
the amount of fluid consumed. The average value, $\bar{v}$, of the control level
variation, evaluated over each set of twenty-five Monte Carlo runs, is
displayed in Fig. 3.4-1(d).
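The control variation of Eq. (3.4-1) is simple to evaluate from a stored command history. The sketch below is illustrative only; the function name and the convention that the list holds $u_0, \ldots, u_{N-1}$ are assumptions.

```python
def control_variation(commands):
    """Return |u_0| plus the sum of |u_{i+1} - u_i| over the command history."""
    if not commands:
        return 0.0
    # Pair each command with its successor: (u_0, u_1), ..., (u_{N-2}, u_{N-1}).
    steps = zip(commands, commands[1:])
    return abs(commands[0]) + sum(abs(later - earlier) for earlier, later in steps)
```

A command history that alternates between the deflection limits, such as `[0.2, -0.2, 0.2]`, accumulates a variation of about 1.0 radian, whereas a constant history `[0.1, 0.1, 0.1]` accumulates only the initial 0.1 radian. This is why a quasi bang-bang control action produces a large value of $v$.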

The data presented for both $\bar{e}$ and $\bar{v}$ in Figs. 3.4-1(c) and (d)
indicate that the control sequence $\{u_{i_o}^o\}$ demands significantly more
actuator energy than does $\{u_{i_{oo}}\}$, particularly for large launch ranges. This
observation is consistent with the behavior of $\bar{e}$ in Fig. 3.2-1. The high
level of $\bar{v}$ is a result of the fact that the control action is quasi bang-bang
in nature; that is, $u_{i_o}^o$ tends to change sign frequently.

The levels of $\bar{e}$ and $\bar{v}$ required for $\{u_{i_{oo}}\}$ remain fairly constant
as the launch time increases; however, the curves for $\{u_{i_o}^o\}$ increase with
launch time. This behavior can be expected from the gain histories $c_i$ and
$d_i$ associated with each law which are illustrated in Fig. 3.2-2. For long
trajectories $\{u_{i_{oo}}\}$ may be preferred, at least until the missile is close
enough to the target so that sufficient control surface actuation capability
remains to permit the use of $\{u_{i_o}^o\}$. Thus some type of dual-mode guidance,
combining the laws discussed in this report, may be desirable. There are a
variety of methods one could use to accomplish this; the particular one selected
would be strongly dependent upon the type of mission (i.e., long-range
or short-range) under consideration. This is a subject which merits
further investigation.

In presenting performance data for the other cases given in
Table  3.4-1,  only  the  rms  terminal  miss  distance  and  peak  acceleration 
are  displayed.  In  all  cases  the  behavior  of  v  and  e  is  qualitatively  the 
same  as  in  Fig.  3.4-1. 

Case  2  in  Table  3.4-1  is  characterized,  relative  to  Case  1,  by 
a  lower  missile  airspeed,  V.  This  change  also  affects  other  parameters  — 
namely  the  airframe  stability  derivatives,  which  are  dependent  upon  Mach 
number,  and  the  closing  velocity.  For  the  particular  airframe  data  used 
here,  the  decrease  in  V  alters  the  airframe  dynamics  so  that  the  control 
surface effectiveness is reduced and the distance between the airframe-wing
center of pressure and the missile center of gravity is increased.

This  reduces  the  airframe  capability  to  generate  lift  (lateral  acceleration) 
and  tends  to  cause  an  increase  in  the  miss  distance  compared  with  Case  1. 
On  the  other  hand,  if  we  assume  that  the  closing  velocity  is  reduced  by  the 
same  amount  as  the  airspeed,  then  for  a  given  launch  range,  the  total 
number  of  measurements  taken  over  the  entire  trajectory  increases.  Con¬ 
sequently,  more  averaging  of  measurement  errors  is  performed  by  the 
Kalman  filter  in  the  guidance  controller,  giving  potentially  better  estimates 
of  the  system  state  variables.  Thus  a  decrease  in  closing  velocity  may  tend 
to  reduce  miss  distance. 
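The measurement-count argument can be made concrete with a short calculation. The sketch below assumes a constant closing velocity over the whole trajectory (an idealization) and uses the Table 3.4-1 values $V_c$ = 2000 and 1000 ft/sec with $\Delta t$ = 0.05 sec; the 6000 ft launch range and the function name are chosen only for illustration.

```python
def measurement_count(launch_range_ft, closing_velocity_fps, sample_interval_sec):
    """Number of measurements taken before intercept at a constant closing velocity."""
    time_to_go_sec = launch_range_ft / closing_velocity_fps
    return round(time_to_go_sec / sample_interval_sec)


# Same launch range, Case 1 and Case 2 closing velocities from Table 3.4-1.
standard_count = measurement_count(6000.0, 2000.0, 0.05)
low_speed_count = measurement_count(6000.0, 1000.0, 0.05)
```

Under these assumptions the filter processes 60 measurements in Case 1 and 120 in Case 2 for the same launch range, which is the sense in which a lower closing velocity allows more averaging of measurement errors.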

For  the  particular  parameter  values  in  Case  2,  the  adverse 
effect  on  miss  distance  produced  by  the  changes  in  airframe  dynamics 
dominates  any  improvement  obtained  with  a  smaller  closing  velocity,  as 
seen by comparing Figs. 3.4-1(a) and 3.4-2(a) at the same values of
launch range. The reduction in the airframe's ability to generate lateral
acceleration is evident from a comparison of Figs. 3.4-1(b) and
3.4-2(b).
Figure 3.4-2  Performance Evaluation of Suboptimal Guidance Laws
Including Acceleration Limiting: Case 2
The effect of increasing the target time constant ($1/\beta$) is
illustrated by Case 3 in Table 3.4-1. This change implies that the target
acceleration changes more slowly than in Case 1. The effect on the guidance
system is that the Kalman filter can track the target acceleration
more accurately because it is more nearly constant. Consequently, with
all other parameter values being unchanged, the miss distance for Case 3
should be somewhat smaller than for Case 1. This expectation is verified
by comparison of Figs. 3.4-1(a) and 3.4-3(a).

The effect of an increase in the homing sensor measurement
noise level is demonstrated by Case 4 where the target dimension parameter,
$\sigma_s$, is increased to 15 feet. Recall that $\sigma_s$ represents the average
distance normal to the line-of-sight between reflecting points in the target.
The effect of this change is to increase the value of $r$ by a factor of ten.
Evidently, comparing Figs. 3.4-1(a) and 3.4-4(a), much larger miss
distances are incurred for both guidance laws; however, the difference
between the performance of $\{u_{i_o}^o\}$ and $\{u_{i_{oo}}\}$ increases. The former yields
an rms miss distance that is as much as fifty percent less than that
provided by $\{u_{i_{oo}}\}$. It is also noted, comparing Figs. 3.4-1(b) and 3.4-4(b),
that larger acceleration levels are needed; these are attributed to the
increased rms error in estimating the guidance state variables.
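As a rough check on the factor of ten quoted for Case 4, and assuming (as that statement suggests) that the affected measurement noise parameter scales with the square of the target dimension parameter, the ratio between the Case 4 and Case 1 values in Table 3.4-1 is

```latex
\left( \frac{15.0}{4.75} \right)^{2} \approx 9.97 \approx 10
```

The quadratic scaling is an assumption made here for illustration; the noise model itself is defined elsewhere in the report.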

Finally, Case 5 represents the effect of an increase in rms target
acceleration, $\sigma$, from about 3 g's to about 6 g's. The corresponding
effect on terminal miss distance and lateral acceleration is shown in Fig.
3.4-5. Relative to Case 1, the miss distance increases for both guidance
laws; however, there is a widening of the difference between the performance
of $\{u_{i_o}^o\}$ and $\{u_{i_{oo}}\}$. The required lateral acceleration levels also increase
for both guidance laws.
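The quoted rms levels can be related to the tabulated values with a short conversion. Assuming that the $\sigma^2$ row of Table 3.4-1 is the target acceleration variance in (ft/sec^2)^2, and taking $g \approx 32.2$ ft/sec^2, we have

```latex
\sqrt{9.0 \times 10^{3}} \approx 94.9\ \mathrm{ft/sec^2} \approx 2.9\,g ,
\qquad
\sqrt{3.6 \times 10^{4}} \approx 189.7\ \mathrm{ft/sec^2} \approx 5.9\,g
```

which agrees with the stated change from about 3 g's to about 6 g's. The unit interpretation of the table row is an assumption, since the table does not state units.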
Figure 3.4-3  Performance Evaluation of Suboptimal Guidance Laws
Including Acceleration Limiting: Case 3

Figure 3.4-4  Performance Evaluation of Suboptimal Guidance Laws
Including Acceleration Limiting: Case 4

Figure 3.4-5  Performance Evaluation of Suboptimal Guidance Laws
Including Acceleration Limiting: Case 5

Summary - This section (3.4) has presented an evaluation of the
guidance laws derived in Chapter 2, modified with the predictive
acceleration limiter described in Section 3.3. In all simulations it is found that
the limiter successfully maintains the rms acceleration level below the
prescribed bound. In addition, the terminal miss distance achieved with
$\{u_{i_o}^o\}$ is as much as fifty percent less than that produced by $\{u_{i_{oo}}\}$. This
comparison substantially reverses the conclusion in Section 3.2 that the
optimal nonlinear guidance law (without acceleration limiting) offers
insignificant improvement over the suboptimal law. The amount of control
energy, as measured by either the "effort" $e$ or the "variation" $v$ required,
is much greater for $\{u_{i_o}^o\}$ than it is for $\{u_{i_{oo}}\}$; this difference becomes
more pronounced the further the missile is from the target at the beginning
of the homing guidance phase. Although we have stated that terminal miss
distance is the primary basis for comparing guidance laws, the control
energy requirements cannot be ignored. In some applications it may be
advisable to combine the advantages of each guidance law in one dual-mode
technique.


The mathematical model used here to evaluate guidance laws is
sufficiently realistic to indicate that the optimal guidance law derived in
Chapter 2 can offer substantial performance benefits over suboptimal laws
derived by minimizing a quadratic performance index, when acceleration
limiting is required. To obtain a better knowledge of performance
capability, homing sensor noise models with time-varying statistics should be
investigated. In addition, sensitivity studies should be made to determine
the amount of performance degradation caused by inaccurate modeling of
the guidance dynamics and by the intentional use of more simplified (e.g.,
constant gain) control laws. These topics will be the subject of further
study.
4.  SUMMARY AND CONCLUSIONS


4.1  SUMMARY 

This  report  is  concerned  with  guidance  laws  for  tactical  missiles 
which  account  for  the  presence  of  random  target  acceleration,  homing  sensor 
measurement  errors,  a  constraint  on  the  maximum  control  level  and  a  con¬ 
straint  on  the  maximum  airframe  lateral  acceleration.  Emphasis  is  placed 
upon  those  techniques  which  can  potentially  be  applied  in  practical  tactical 
missile  weapons  systems  in  the  next  ten  to  twenty  years,  especially  those 
which  can  take  advantage  of  the  rapid  improvement  in  computer  hardware 
technology.  The  particular  missile  considered  for  this  investigation  has  an 
aerodynamically  controlled  airframe  with  fixed  wings  and  tail-mounted  con¬ 
trol  surfaces.  However,  the  conclusions  obtained  here  apply  to  other  types 
of  missiles  as  well. 

In Chapter 2 the tactical missile guidance problem is formulated
in the context of optimal stochastic control theory and two different
guidance laws are derived, each being associated with somewhat different
problem formulations. An optimal nonlinear guidance law is determined
which minimizes the expected value of the squared terminal miss,
subject to a constraint on the missile control surface deflection angle. This
law results in the sequence of optimal control commands designated as
$\{u_i^o\}$. The other guidance law is one which minimizes the expected value of
a weighted sum of the squared terminal miss distance and a quadratic
penalty on the control; in this case the control level limit is ignored in
deriving the optimal control sequence. Then the latter is passed through
a limiter which "clips" the excess control magnitude, resulting in a
suboptimal nonlinear guidance law represented by $\{u_{i_o}\}$.

Simulations of both of the above mentioned guidance laws are
described in Chapter 3. It was found that both laws tend to call for large
airframe lateral accelerations that could be excessive in some practical
applications. Consequently a predictive acceleration limiter was devised
for modifying each control sequence to prevent excessive acceleration
build-up; this procedure is described in Section 3.3. The modified
control sequences are designated as follows:

    $\{u_i^o\} \rightarrow \{u_{i_o}^o\}$

    $\{u_{i_o}\} \rightarrow \{u_{i_{oo}}\}$


The sequences $\{u_{i_o}^o\}$ and $\{u_{i_{oo}}\}$ are regarded as two additional suboptimal
nonlinear guidance laws; they are both suboptimal because the limit on
acceleration has been imposed "after the fact," rather than being part of
the guidance law design criterion. To further aid in distinguishing the
guidance laws, it is useful to define the following two categories of
sequences of control commands:

    Type I Laws                         Type II Laws

    $\{u_{i_o}\}$, $\{u_{i_{oo}}\}$     $\{u_i^o\}$, $\{u_{i_o}^o\}$

The type I laws are derived by applying the theory of linear gaussian
systems having quadratic performance indices. The control sequences
associated with the type II laws are derived by including an explicit
constraint on control level in the guidance problem formulation.
4.2  CONCLUSIONS

The important conclusions deduced from computer simulation
results about the various guidance laws are summarized below:

•  When the acceleration limiter is active (i.e.,
when $\{u_{i_o}^o\}$ and $\{u_{i_{oo}}\}$ differ significantly from
$\{u_i^o\}$ and $\{u_{i_o}\}$ respectively) the type II guidance
law, $\{u_{i_o}^o\}$, provides a substantially smaller
miss distance (as much as fifty percent smaller)
than the type I law, $\{u_{i_{oo}}\}$.

•  The type II laws are significantly simpler to
mechanize than the type I laws; the feedback
gains for the former are derived analytically
by solving appropriate algebraic equations,
whereas the gains for the latter are determined
numerically by iteratively processing the
required matrix difference equations.

•  The type II laws call for average control levels
that are much larger than those associated with
the type I laws, typically ten times larger.
Also the type II laws are bang-bang in nature,
a fact that may have an adverse effect where
significant body bending modes exist.

•  The predictive acceleration limiter successfully
provides the desired control over airframe
acceleration level.

The observations that a type II guidance law ($\{u_{i_o}^o\}$) can perform
significantly better and is also more easily mechanized than the corresponding
type I ($\{u_{i_{oo}}\}$) law are important developments from this research.


The above conclusions provide several criteria for selecting a
guidance law for a particular application. If the lowest possible terminal
miss distance is desired, without regard for the amount of control energy
expended, the type II laws are preferred. The most significant performance
advantage of the latter exists in those applications where terminal
guidance accuracy is significantly limited by the maximum allowable (or
achievable) missile maneuvering acceleration level. If control energy
consumption must be lower than that associated with the type II laws, the
type I laws can be used. A third possibility which combines the best of
both techniques is a dual-mode procedure, using a type I law to conserve
control energy far from the target and a type II law at short range to
reduce terminal miss.

Some  additional  comments  about  the  potential  computational  re¬ 
quirements  of  the  guidance  system  are  in  order.  Both  types  of  laws  re¬ 
quire  the  same  time-varying  linear  state  estimator  —  a  Kalman  filter. 
For  the  planar  motion  problem  considered  in  this  report,  a  five-state 
filter  was  employed  to  estimate  the  state  variables  associated  with  the 
target,  the  missile  translational  motion  relative  to  the  target,  and  the 
autopilot.  In  an  actual  application  it  is  likely  that  a  simpler  three-state 
filter  —  obtained  by  assuming  that  autopilot  state  variable  measurement 
errors  are  negligible  —  would  yield  satisfactory  operation.  In  addition 
to  a  filter,  each  guidance  law  uses  a  set  of  time-varying  feedback  gains 
to  generate  the  feedback  commands.  We  have  pointed  out  the  fact  that  the 
gains for the type II laws are more easily calculated. In circumstances
where  the  dynamics  defining  the  guidance  problem  are  known  a  priori, 
the  feedback  gains  required  for  all  guidance  laws  can  be  calculated  prior 
to  flight.  Thus  each  gain  can  be  approximated  as  a  simple  polynomial  and 
stored  in  the  guidance  computer.  However  if  some  important  dynamic 
parameters  —  such  as  those  associated  with  the  missile  airframe  —  are 
unknown and must be identified on-line, then the system gains must be
calculated on-line. In the latter situation, the computational advantage of
the nonlinear-type laws is quite significant.
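
The remark above about storing each precomputed gain as a simple polynomial can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names `fit_polynomial` and `evaluate_gain`, the least-squares fit by normal equations, and the sample gain history in the usage note are all hypothetical and are not taken from this report.

```python
def fit_polynomial(times, gains, degree):
    """Least-squares coefficients c[0] + c[1]*t + ... + c[degree]*t**degree.

    Hypothetical helper: solves the normal equations by Gaussian
    elimination with partial pivoting (adequate for low degrees only).
    """
    n = degree + 1
    # Normal equations A c = b, A[r][k] = sum(t**(r+k)), b[r] = sum(g * t**r).
    a = [[sum(t ** (r + k) for t in times) for k in range(n)] for r in range(n)]
    b = [sum(g * t ** r for t, g in zip(times, gains)) for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
            b[r] -= factor * b[col]
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][k] * coeffs[k] for k in range(r + 1, n))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs


def evaluate_gain(coeffs, t):
    """On-line evaluation of the stored polynomial by Horner's rule."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * t + c
    return value
```

In such a scheme the fit would be done before flight on a gain history computed off-line, and only `evaluate_gain` (a few multiplications and additions per gain) would run in the guidance computer. For instance, fitting a second-degree polynomial to samples of the made-up history 2 - 3t + 0.5t^2 recovers those three coefficients.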

It is likely that a number of simplifications to the above guidance
laws could be made without seriously degrading performance.
For example, less frequent updating of filter and feedback gains can be
tried;  in  some  cases  constant  gains  may  be  adequate.  It  may  also  be 
desirable  to  predesign  the  missile  autopilot  (making  it  adaptive  if  necessary) 
and  then  neglect  autopilot  dynamics  in  formulating  the  guidance  problem; 
this  is  found  to  be  a  reasonable  procedure  in  the  simpler  applications  treated 
in  Ref.  6.  The  investigation  of  such  modifications  is  a  logical  extension  of 
the  work  reported  here. 


4.3  TOPICS  FOR  FUTURE  RESEARCH 

The investigation described here includes many effects that are
actually encountered in a tactical missile engagement: measurement noise,
target maneuvers, bounded controls, bounded acceleration, airframe dynamics,
etc. The study indicates that guidance laws which are derived taking
into account bounded missile control level can offer significant performance
advantages when realistic levels of measurement noise and target
acceleration exist. However a more complete evaluation of the guidance
laws is needed, including the following tasks:


•  Analysis  of  the  sensitivity  of  the  stochastic  guid¬ 
ance  laws  described  above  to  errors  in  modeling 
sensor  noise,  target  motion,  and  missile  auto¬ 
pilot  dynamics  should  be  performed.  The  objective 
is  to  determine  which  guidance  law  is  least  affected 
by  imperfect  knowledge  of  the  guidance  equations  of 
motion.

•  A detailed comparative evaluation of various guidance
laws (including classical methods such as proportional
guidance, pursuit guidance, etc.) operating
in the presence of homing sensor noise, a maneuvering
target, and control surface limiting should
be  carried  out  to  provide  performance  curves  that  are 
useful  for  system  specification. 


The  outcome  of  such  a  study  would  be  indications  of  the  ultimate  performance 
that  can  be  achieved  in  a  given  tactical  situation  with  a  given  missile  design. 

Another  topic  of  interest  is  the  derivation  of  guidance  laws  which 
account  for  intelligent  target  maneuvers.  This  is  motivated  by  the  possi¬ 
bility  that  the  enemy  target  may  know  what  type  of  guidance  law  is  being 
used  against  him;  therefore  he  may  be  able  to  employ  a  rational  evasion 
technique  that  will  achieve  larger  miss  distances  than  if  he  used  random 
maneuvers.  Investigation  of  this  problem  through  the  use  of  differential 
game  theory  is  recommended. 


APPENDIX A

OPTIMAL  STOCHASTIC  CONTROL  OF  LINEAR  SYSTEMS 


The subject of optimal stochastic control is concerned with
determining control policies which optimize some probabilistic measure
of performance for stochastic dynamical systems. The techniques studied
in this report apply only to systems with equations of motion that are
linear in both the state and the open loop control variables* and with
observations that are also linear combinations of the state variables. For this
special case the theory of stochastic control is fairly complete for both
continuous and discrete time systems and has been extensively documented.
This appendix summarizes the main results for discrete time systems with
appropriate references to the literature, omitting those mathematical
proofs that are readily available elsewhere.

* That is, Eq. (A.1-1) is linear in x(t) and u(t), regardless of how u(t)
may be generated. For example, u(t) may be a nonlinear function of x(t).

It should also be mentioned that a large body of theory exists
for stochastic control systems having nonlinear equations of motion,
especially for discrete systems (e.g., Refs. 9 and 10). However, few
results are available that lead to practical control laws.


A.1  PROBLEM FORMULATION

A linear continuous stochastic dynamic system is defined by differential
equations of the form

$$\dot{x}(t) = F(t)\,x(t) + G(t)\,u(t) + w(t) \tag{A.1-1}$$

where F(t) and G(t) are known time-varying matrices, x(t) is the
n-dimensional state vector, u(t) is an m-dimensional set of control variables,
and w(t) is an n-dimensional random disturbance input to the system. It is
usually assumed that w(t) is a gaussian random process, also called
"process noise," having the following statistical characteristics:*

Mean Value:  $$E\{w(t)\} = 0$$

Covariance Matrix:  $$E\{w(t)\,w(\tau)^T\} = Q(t)\,\delta(t-\tau) \tag{A.1-2}$$

The quantity $\delta(t-\tau)$ is the unit impulse (Dirac delta) function and Q(t) is a
known positive semidefinite matrix. Although Eq. (A.1-2) specifies that
w(t) has zero mean, a nonzero mean can readily be included when it is
known.


To complete the specification of the system dynamics described
by Eq. (A.1-1), the initial state must be provided. We assume that $x(t_0)$
is a vector gaussian random variable having known mean and covariance
matrix given by

$$E\{x(t_0)\} = \mu$$
$$E\{[x(t_0) - \mu][x(t_0) - \mu]^T\} = P_0 \tag{A.1-3}$$

* E{ } denotes mathematical expectation.

The above model defines w(t) as a "white noise" random process
which has the property that sample values taken at different time instants
are uncorrelated. If the latter condition does not hold for the system under
investigation, Eq. (A.1-1) can often be modified to obtain a valid
mathematical model that does have white process noise.* The assumption
that the process noise is gaussian is often physically reasonable because
many random disturbances can be accurately modeled as a superposition
of a large number of statistically independent random events. The resulting
aggregate process has a probability distribution that approaches the
gaussian distribution as the number of constituent events becomes large,
regardless of the probability distributions of the individual events.** In
addition, gaussian processes have the desirable mathematical property
that their gaussian character is preserved when they are "passed through"
a linear system, such as the one represented by Eq. (A.1-1). That is, if
w(t) and $x(t_0)$ are gaussian and u(t) is a known function of time on an
interval $t_0 \le t \le t_1$, then $x(t_1)$ is also gaussian; the mean value of $x(t_1)$ is
determined by the initial mean, $\mu$, and the known history of u(t).

* For example, the time-correlated process can often be regarded as
the output of a linear system driven by white noise and the state variables
associated with the latter are included in the definition of x(t).

** This is a paraphrase of the "central limit theorem" (see Ref. 11).

In order that a feedback control policy can be mechanized, measurements
related to the state vector x(t) must be available. In a physical
system such measurements are typically obtained by sensors whose outputs
are observations of known functions of the state variables corrupted by
measurement errors. Furthermore, measurements frequently can be observed
or accepted only at discrete time instants, $t_i$ ($i = 0, 1, \ldots$), either
because the sensor inherently operates as a sampler or because a digital
computer is used to process the measurement data. For missile applications
the above conditions can usually be expressed by the linear measurement
equation,

$$z_i = H_i x_i + v_i; \qquad i = 0, 1, \ldots \tag{A.1-4}$$

In Eq. (A.1-4), $x_i$ denotes $x(t_i)$, $z_i$ is a q-dimensional measurement vector,
$H_i$ is a known matrix and $\{v_i\}$ is a gaussian white noise* sequence specified by

$$E\{v_i\} = 0 \quad \text{for all } i$$
$$E\{v_i v_j^T\} = \begin{cases} R_i; & i = j \\ 0; & i \ne j \end{cases} \tag{A.1-5}$$

where $R_i$ is a positive definite matrix. The noise sequence accounts for
the measurement errors and $H_i$ accommodates all situations where the
measurements are linear functions of the state variables. It is also usually
reasonable to assume that the process w(t) is uncorrelated with the
sequence $\{v_i\}$; however the subsequent discussion can easily be modified if
this condition does not hold (see Ref. 12).


An optimal feedback control law is to be selected for u(t) in
Eq. (A.1-1) so that an appropriate performance index J is minimized. We
will allow u(t) to be a function of all measurements that have been taken up
to time t. Just as in the case of the measurements, it is usually true that
new values of the control can be computed only at discrete instants of time
because of the data processing requirements. Consequently we assume
here that u(t) is to be held constant at the value $u_i$ over the interval
$t_i \le t < t_{i+1}$, where $t_i$ is coincident with the measurement time;** therefore
the problem of obtaining u(t) reduces to that of determining the sequence $\{u_i\}$.
The latter consideration motivates the following choice for the form of the
performance index:†

$$J = \underset{\{x_i\}}{E}\left\{ f(x_N) + \sum_{i=0}^{N-1} L_i(x_i, u_i, t_i) \right\} \tag{A.1-6}$$

where $t_0$ and $t_N$ are specified initial and final times. In addition there may
be a constraint on $\{u_i\}$ of the form‡

$$u_i \in \Omega_i \tag{A.1-7}$$

where $\Omega_i$ is a specified region in m-dimensional euclidean space. For
example, if we require the magnitude of each component of $u_i$ to be no
greater than 1 for all i, then $\Omega_i$ is an m-dimensional
box with sides two units long. The expectation operation is required in
Eq. (A.1-6) because the index can be minimized only in a statistical sense;
the actual value of the quantity inside the brackets cannot be evaluated
because x(t) is a random process.

* For the case where the measurement noise is known to be
correlated in time see Ref. 12.

** In some situations it may be possible to process measurements faster
than control changes can be computed; in other cases some measurements
(e.g., a gyro output) may be obtainable more frequently than others (e.g., a
homing sensor output). These possibilities can readily be included in the
theory; however, for convenience of exposition, the measurement and control
computation times are considered to be identical here.

† The notation $\underset{\{x_i\}}{E}\{\ \}$ means that the mathematical expectation is to
be carried out over all the random variables $x(t_i)$, $i = 0, 1, \ldots, N$,
appearing within the braces.

‡ The symbol $\in$ means "is contained in."

Because the performance index depends upon the state and control
variables at discrete instants of time, differential equations (Eq.
(A.1-1)) are not required to describe the dynamic behavior of the system.
Instead, difference equations, derived from the differential equations,
which relate the value of the state at time $t_i$ to its value at time $t_{i+1}$ are
sufficient. The latter are readily obtained from Eqs. (A.1-1) and (A.1-2)
using the properties of the solutions of linear stochastic differential
equations (see Ref. 12 or 13); these expressions together with the other
relations needed for the optimization problem are summarized below:


Discrete Time Optimal Stochastic Control Problem: Determine
the optimal piecewise constant feedback control policy as a function of all
past measurements*

$$u(t) = u_i^o(Z_i); \qquad t_i \le t < t_{i+1}$$
$$Z_i = \{z_j;\ 0 \le j \le i\} \tag{A.1-8}$$

which minimizes the performance index

$$J = \underset{\{x_i\}}{E}\left\{ f(x_N) + \sum_{i=0}^{N-1} L_i(x_i, u_i, t_i) \right\} \tag{A.1-9}$$

for a specified value of $t_N$, subject to the discrete time constraint equations

$$x_{i+1} = \Phi_i x_i + \Gamma_i u_i + w_i$$
$$z_i = H_i x_i + v_i$$
$$u_i \in \Omega_i$$
$$E\{x_0\} = \mu; \qquad E\{(x_0 - \mu)(x_0 - \mu)^T\} = P_0 \tag{A.1-10}$$

The matrices $\Phi_i$ and $\Gamma_i$ in Eq. (A.1-10) are related to the parameters in
Eqs. (A.1-1) and (A.1-2) by

$$\Phi_i = \Phi(t_{i+1}, t_i)$$
$$\frac{d}{dt}\Phi(t, \tau) = F(t)\,\Phi(t, \tau); \qquad \Phi(\tau, \tau) = I \tag{A.1-11}$$
$$\Gamma_i = \int_{t_i}^{t_{i+1}} \Phi(t_{i+1}, \tau)\,G(\tau)\,d\tau \tag{A.1-12}$$

* The superscript "o" denotes an optimal control. The symbol $Z_i$ denotes
the sequence of all measurements observed up through time $t_i$.

The sequences $\{w_i\}$ and $\{v_i\}$ are gaussian white noise sequences satisfying
the conditions,

$$E\{w_i\} = E\{v_i\} = 0$$
$$E\{w_i w_j^T\} = \begin{cases} Q_i \triangleq \int_{t_i}^{t_{i+1}} \Phi(t_{i+1}, \tau)\,Q(\tau)\,\Phi(t_{i+1}, \tau)^T d\tau; & i = j \\ 0; & i \ne j \end{cases}$$
$$E\{v_i v_j^T\} = \begin{cases} R_i; & i = j \\ 0; & i \ne j \end{cases} \tag{A.1-13}$$
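
For a scalar system with constant coefficients the integrals in Eqs. (A.1-11) through (A.1-13) can be evaluated in closed form, which gives a convenient check on any numerical discretization. The sketch below is a minimal illustration under that assumption only (constant f, g, and noise intensity q over one sample interval); the function name `discretize_scalar` is hypothetical and does not appear in this report.

```python
import math


def discretize_scalar(f, g, q, dt):
    """Discrete-time parameters for dx/dt = f*x + g*u + w, sampled every dt.

    Scalar, time-invariant special case of Eqs. (A.1-11) to (A.1-13):
      phi   = exp(f*dt)
      gamma = integral of exp(f*(dt - s)) * g      over 0 <= s <= dt
      q_d   = integral of exp(2*f*(dt - s)) * q    over 0 <= s <= dt
    """
    phi = math.exp(f * dt)
    if abs(f) < 1e-12:
        # Limiting case f -> 0 (pure integrator).
        gamma = g * dt
        q_d = q * dt
    else:
        gamma = g * (phi - 1.0) / f
        q_d = q * (phi * phi - 1.0) / (2.0 * f)
    return phi, gamma, q_d
```

For time-varying or matrix-valued F(t), G(t), and Q(t) the same quantities would have to be obtained by integrating Eq. (A.1-11) numerically; the closed forms above apply only to the simplified case assumed here.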


The above discrete time formulation is used throughout this report.
An analogous development for continuous systems is available in the cited
references.

A.2  QUADRATIC PERFORMANCE INDICES AND
UNCONSTRAINED CONTROL VARIABLES

A particularly important case of the discrete time optimal
control problem formulated in the preceding section occurs when the
performance index in Eq. (A.1-9) is a quadratic function of the state and
control; i.e.,

$$J = E\left\{ x_N^T V_N x_N + \sum_{i=0}^{N-1} \left[ x_i^T V_i x_i + u_i^T W_i u_i \right] \right\} \tag{A.2-1}$$

where $V_i$ ($i = 0, 1, \ldots, N$) are positive semidefinite matrices and
$W_i$ ($i = 0, 1, \ldots, N-1$) are positive definite matrices. In addition, we
assume the control variables are unconstrained; i.e., $\Omega_i$ in Eq. (A.1-10)
is the entire m-dimensional euclidean space. This type of performance
index is often chosen when the objective is to reduce the magnitude of the
state without using excessive amounts of control. It tends to limit
energy expenditure and it also tends to limit the magnitude of the required
control level, although it does not explicitly bound the latter. Perhaps a
more important reason for its popularity is that the optimal feedback
control is linear and readily computed,* as demonstrated below.

Because the dynamics and measurements in Eq. (A.1-10) are
linear, the optimal control sequence $\{u_i^o\}$ that minimizes J can be
determined analytically (Ref. 12) in the form:

$$u_i^o = -C_i \hat{x}_i$$
$$\hat{x}_i = \hat{x}_i' + K_i (z_i - H_i \hat{x}_i')$$
$$\hat{x}_{i+1}' = \Phi_i \hat{x}_i + \Gamma_i u_i^o; \qquad \hat{x}_0' = \mu \tag{A.2-2}$$

where

$$C_i = (\Gamma_i^T S_{i+1} \Gamma_i + W_i)^{-1} \Gamma_i^T S_{i+1} \Phi_i$$
$$S_i = \Phi_i^T S_{i+1} \Phi_i - C_i^T (W_i + \Gamma_i^T S_{i+1} \Gamma_i) C_i + V_i$$
$$S_N = V_N \tag{A.2-3}$$

$$K_i = P_i' H_i^T (H_i P_i' H_i^T + R_i)^{-1}$$
$$P_{i+1}' = \Phi_i P_i \Phi_i^T + Q_i; \qquad P_0' = P_0$$
$$P_i = P_i' - K_i (H_i P_i' H_i^T + R_i) K_i^T \tag{A.2-4}$$

Here a prime marks an estimate or covariance computed before the measurement
$z_i$ is processed, and the $P_0$ on the right side of $P_0' = P_0$ is the initial
covariance of Eq. (A.1-10).

* This is a relative judgement; it is readily computed compared with
solutions to many more general control problems.


The associated minimum value of the performance index in Eq. (A.2-1),
$J^o$, is given by*

$$J^o = \mathrm{Tr}\left\{ S_0 (P_0' + \mu \mu^T) + \sum_{i=0}^{N-1} S_{i+1} (Q_i + \Gamma_i C_i P_i \Phi_i^T) \right\} \tag{A.2-5}$$

* The notation Tr{ } denotes the trace of the matrix (sum of its
diagonal elements) within the braces.

The quantity $\hat{x}_i$ in Eq. (A.2-2) is the conditional mean of the
gaussian vector $x_i$, given knowledge of the control and measurement
histories up through time $t_i$.* The conditional mean has the property that
it is the best possible estimate of $x_i$ under a wide variety of estimation
criteria.** Equations (A.2-2) and (A.2-4) constitute a discrete Kalman
filter which recursively calculates $\hat{x}_i$ in terms of the known controls $\{u_i^o\}$,
the system dynamics, the random process statistics, and the measurements.
The matrix $K_i$ is referred to as the Kalman gain.
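
The filter recursion of Eqs. (A.2-2) and (A.2-4) can be sketched for a scalar state as follows. This is a minimal illustration only, assuming constant scalar parameters; the function names `kalman_update`, `kalman_predict`, and `run_filter` are hypothetical, and the matrix case would replace the divisions by matrix inverses.

```python
def kalman_update(x_prior, p_prior, z, h, r):
    """Process one measurement z = h*x + v, var(v) = r (scalar case)."""
    innovation_var = h * p_prior * h + r
    gain = p_prior * h / innovation_var              # Kalman gain K_i
    x_hat = x_prior + gain * (z - h * x_prior)       # conditional mean
    p_post = p_prior - gain * innovation_var * gain  # error variance P_i
    return x_hat, p_post, gain


def kalman_predict(x_hat, p_post, u, phi, gamma, q):
    """Propagate the estimate and variance to the next sample time."""
    return phi * x_hat + gamma * u, phi * p_post * phi + q


def run_filter(measurements, controls, mu, p0, phi, gamma, h, q, r):
    """Return [(x_hat_i, P_i, K_i), ...] for the given data sequences."""
    x_prior, p_prior = mu, p0
    history = []
    for z, u in zip(measurements, controls):
        x_hat, p_post, gain = kalman_update(x_prior, p_prior, z, h, r)
        history.append((x_hat, p_post, gain))
        x_prior, p_prior = kalman_predict(x_hat, p_post, u, phi, gamma, q)
    return history
```

Note that the variances and gains produced by this recursion do not depend on the measured values themselves, which is why they can be computed before flight when the model parameters are known.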

The gain matrix $C_i$ determined from Eq. (A.2-3) is identical to
that associated with the optimal control law that minimizes the deterministic
performance index

$$x_N^T V_N x_N + \sum_{i=0}^{N-1} \left( x_i^T V_i x_i + u_i^T W_i u_i \right)$$

assuming the process and measurement noise sequences in Eq. (A.1-10)
are  absent.  Consequently  the  optimal  stochastic  controller  is  mechanized 
by  two  distinct  operations  —  an  optimal  linear  estimator  (Kalman  filter) 
that  is  independent  of  the  performance  index  weighting  matrices,  and  an 
optimal  feedback  control  law  whose  gains  (Eq.  (A.  2-3))  are  independent  of 
the  random  process  statistics.  This  is  the  so-called  separation  property 
for  linear  systems  with  quadratic  performance  indices  and  unconstrained 
controls;  the  structure  of  the  controller  is  illustrated  in  Fig.  A.  2-1. 
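
The backward recursion of Eq. (A.2-3) can be sketched for a scalar system as follows. This is a minimal illustration assuming constant scalar weights; the function name `lq_gains` and its argument names are hypothetical. The point of the sketch is that the noise statistics do not appear among the arguments, in keeping with the separation property described above.

```python
def lq_gains(phi, gamma, v, w, v_final, n_stages):
    """Scalar form of Eq. (A.2-3).

    Returns (c, s): feedback gains c[0..N-1] and cost weights s[0..N],
    with s[N] = v_final. The command at stage i is u = -c[i] * x_hat_i.
    """
    s = [0.0] * (n_stages + 1)
    c = [0.0] * n_stages
    s[n_stages] = v_final
    for i in range(n_stages - 1, -1, -1):
        m = w + gamma * s[i + 1] * gamma      # W_i + Gamma' S_{i+1} Gamma
        c[i] = gamma * s[i + 1] * phi / m     # feedback gain C_i
        s[i] = phi * s[i + 1] * phi - c[i] * m * c[i] + v
    return c, s
```

With phi = gamma = 1, no running state penalty (v = 0), unit control weight, and unit terminal weight over two stages, the recursion gives gains of 1/3 and 1/2, so the feedback gain increases as the final time approaches.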

* The conditional mean includes the effect on the state in Eq. (A.1-10)
of a known control input as well as the unknown initial conditions and
the unknown process noise.

** For example, $\hat{x}_i$ is the value of $\tilde{x}_i$ which minimizes
$E\{(x_i - \tilde{x}_i)^T A (x_i - \tilde{x}_i)\}$,
where A is any positive semidefinite matrix.

Figure A.2-1  Structure of the Optimal Stochastic Controller for a
Linear Plant with a Quadratic Performance Index

The linear control law described above is relatively simple to
implement. When the parameters defining the matrices $\Phi_i$, $\Gamma_i$, etc. in
Eqs. (A.2-2) through (A.2-4) are known a priori,* the gains $C_i$ and $K_i$ can
be precomputed and stored in a computer** so that the only on-line
calculations required are those specified in Eq. (A.2-2). However, the
assumptions of a quadratic performance index and an unconstrained control level
are unrealistic for some problems. In the next section more general
design criteria are considered.


* The parameters of a tactical missile guidance problem are not always known
a priori; this point is discussed in Section 2.1.

** Often simple polynomial approximations to the gain histories
are adequate.

A.3  GENERAL PERFORMANCE INDICES AND
CONSTRAINED CONTROL VARIABLES

This section outlines the method of solving the discrete optimal
control problem posed in Section A.1 for a linear system with the
performance index

$$J = \underset{\{x_i\}}{E}\left\{ f(x_N) + \sum_{i=0}^{N-1} L_i(x_i, u_i, t_i) \right\} \tag{A.3-1}$$

The derivation presented here uses a dynamic programming approach, as
in Ref. 14. We first assume time $t_N$ has occurred and the complete optimal
control sequence $\{u_i^o\}$ has been applied using the observed measurements
$\{z_i\}$. Then proceeding backward from stage to stage, we determine the
individual commands $u_{N-1}^o, u_{N-2}^o, \ldots$ needed to minimize the "cost to
complete the process" $J_{N-1}, J_{N-2}, \ldots$ defined by

$$J_{N-j} = \underset{\{x_i\}}{E}\left\{ f(x_N) + \sum_{i=N-j}^{N-1} L_i(x_i, u_i, t_i) \right\}$$

In this way we can derive a recursive expression from which each optimal
control command can be derived in terms of the control commands and
measurements that preceded it in time, providing a feedback control law.
In the general case this recursive relation is difficult to solve, requiring
extensive numerical calculation; however we shall see that the guidance
problem posed in Chapter 2 can be solved analytically. To aid the discussion
it is convenient to use the notation

$$Z_k \triangleq \{z_i;\ 0 \le i \le k\}$$
$$U_k^o \triangleq \{u_i^o;\ 0 \le i \le k\}$$

That is, $U_k^o$ and $Z_k$ denote the sequences of all optimal controls that have
been applied and all measurements that have been observed up through
time $t_k$.


To begin, we assume that all stages of the process have been
completed and we are at time $t_N$. There are no more optimal controls to
be computed and the optimal value of the terminal cost, $J_N^o$, is defined by

$$J_N^o = \underset{x_N}{E}\left\{ f(x_N) \mid U_{N-1}^o, Z_N \right\} \tag{A.3-2}$$

which is shorthand notation for the expectation operation

$$J_N^o = \int \cdots \int f(x_N)\, p(x_N \mid U_{N-1}^o, Z_N)\, dx_N \tag{A.3-3}$$

The quantity $p(x_N \mid U_{N-1}^o, Z_N)$ is the conditional probability density
function† of $x_N$, as determined by the known measurement and control
sequences, and $dx_N$ denotes the n-dimensional "volume" element $dx_1(t_N)\,
dx_2(t_N) \cdots dx_n(t_N)$. The limits of integration on each integral in Eq.
(A.3-3) are from minus infinity to plus infinity; for convenience, they are
omitted from the notation throughout this discussion, always being
understood as infinite. Now at time $t_N$, $x_N$ is a random variable whose mean
value is a function of the control history, the observed measurement
history, the statistics of the initial state $x_0$ in Eq. (A.1-3), and the statistics
of the measurement and process noise sequences.* In addition the
following important conditions are satisfied at time $t_N$:

•  The random sequences $\{w_i\}$ and $\{v_i\}$ in Eq. (A.1-10)
are gaussian.

•  The control and measurement histories, $U_{N-1}^o$ and
$Z_N$, are known.

•  Both the process dynamics and the measurements
(Eq. (A.1-10)) are linear functions of the state.

Therefore, $x_N$ is "conditionally gaussian;" i.e., $p(x_N \mid U_{N-1}^o, Z_N)$ is a
gaussian function that is completely specified by the conditional mean $\hat{x}_N$
and covariance matrix $P_N$ of $x_N$, defined by

$$\hat{x}_N = \int \cdots \int x_N\, p(x_N \mid U_{N-1}^o, Z_N)\, dx_N$$
$$P_N = \int \cdots \int (x_N - \hat{x}_N)(x_N - \hat{x}_N)^T\, p(x_N \mid U_{N-1}^o, Z_N)\, dx_N$$

The functional form of $p(x_N \mid U_{N-1}^o, Z_N)$, expressed in terms of $\hat{x}_N$ and
$P_N$, is given by

$$p(x_N \mid U_{N-1}^o, Z_N) = (2\pi)^{-n/2}\, |P_N|^{-1/2} \exp\left\{ -\tfrac{1}{2} (x_N - \hat{x}_N)^T P_N^{-1} (x_N - \hat{x}_N) \right\}$$

† Throughout this report, a function of the form p(x|y) denotes the
conditional probability density of x, depending upon a known value of y. This
is an abuse of conventional notation which uses the symbol $p_{x|y}(\xi \mid \eta)$,
where $\xi$ and $\eta$ denote particular values of x and y, respectively. The
shortened notation used here should be clear to the reader, keeping in
mind the fact that p(y|x) and p(x|y) are two different functions of the
same variables, x and y; that is, $p(y|x) \ne p(x|y)\big|_{x \to y,\ y \to x}$.


* As in Section A.2, the mean of the state at any time t contains a
component produced by the known control sequence applied up
through time t as well as a component derived from previously
observed measurement data.

Now because of the three conditions listed immediately above, $\hat{x}_N$ can be
determined by a Kalman filter acting on the measurement data $Z_N$ and known
controls $U_{N-1}^o$ as specified by Eqs. (A.2-2) and (A.2-4); furthermore, $P_N$
is a deterministic quantity (i.e., it is independent of the measurements)
provided by Eq. (A.2-4) at stage i = N. Therefore, substituting the above
expression for $p(x_N \mid U_{N-1}^o, Z_N)$ into Eq. (A.3-3), $J_N^o$ can be calculated as
a function of $\hat{x}_N$;

$$J_N^o = \bar{f}(\hat{x}_N) \tag{A.3-4}$$

The over-bar notation refers to the averaging operation performed in
Eq. (A.3-3). The significant property of Eq. (A.3-4) is that the dependence
of $J_N^o$ on all random variables can be expressed solely in terms of
$\hat{x}_N$ and a number of deterministic quantities such as covariance matrices,
plant dynamics, etc. The latter are suppressed in the notation.

Next suppose that all but the last step of the process defined by
Eq. (A.1-10) have been completed using the collections of known optimal
controls and measurements, $U_{N-2}^o$ and $Z_{N-1}$. Therefore referring to
Eq. (A.3-1), at time $t_{N-1}$ we need to determine the value of $u_{N-1}$, $u_{N-1}^o$,
which minimizes the cost to complete the process,

$$J_{N-1} = \underset{x_{N-1},\, x_N}{E}\left\{ f(x_N) + L_{N-1}(x_{N-1}, u_{N-1}, t_{N-1}) \right\}$$

This minimization is indicated by defining the optimal cost to complete
the process by*

$$J_{N-1}^o = \min_{u_{N-1} \in \Omega_{N-1}} \left\{ \underset{x_{N-1}}{E}\left\{ L_{N-1}(x_{N-1}, u_{N-1}, t_{N-1}) \mid U_{N-2}^o, Z_{N-1} \right\} + \underset{x_N}{E}\left\{ f(x_N) \mid U_{N-2}^o, Z_{N-1} \right\} \right\} \tag{A.3-5}$$

where the notation "$\mid U_{N-2}^o, Z_{N-1}$" emphasizes that the known controls
and measurements are used to define the conditional probability density
functions for $x_{N-1}$ and $x_N$.

* The notation $\min_{a \in A}\{\ \}$ means that the value of a within the set A is
to be determined which minimizes the quantity inside the braces.

Equation (A.3-5) can be investigated in two parts. By applying
the same argument used for Eq. (A.3-2) and assuming for the moment
that the minimization over $u_{N-1}$ has been carried out so that $u_{N-1} = u_{N-1}^o$
is known, the first expectation in Eq. (A.3-5) can be written as

$$\bar{L}_{N-1}(\hat{x}_{N-1}, u_{N-1}^o, t_{N-1}) = \underset{x_{N-1}}{E}\left\{ L_{N-1}(x_{N-1}, u_{N-1}^o, t_{N-1}) \mid U_{N-2}^o, Z_{N-1} \right\} \tag{A.3-6}$$

because $L_{N-1}$ is independent of $x_N$. Just as we found at the Nth measurement
time, $x_{N-1}$ is a gaussian random variable and its conditional mean
$\hat{x}_{N-1}$ is provided by a Kalman filter operating on the measurement history
$Z_{N-1}$ and the control history $U_{N-2}^o$ (note that $x_{N-1}$ is independent of $u_{N-1}^o$).
Continuing with our assumption that $u_{N-1}^o$ is known, the second expectation
in Eq. (A.3-5) can be expressed as follows:

The sequence of equalities in Eq. (A.3-7)* holds because $x_N$ can be regarded
as a function of the control $u_{N-1}^o$ (assumed known as a function of $\hat{x}_{N-1}$);
therefore each term is an equivalent expression for the expectation
conditioned on $U_{N-2}^o$, $u_{N-1}^o$, and $Z_{N-1}$. Now compare Eq. (A.3-5) with Eqs.
(A.3-6) and (A.3-7), where we have assumed throughout that $u_{N-1}^o$ is known,
and observe that

$$J_{N-1}^o = \bar{L}_{N-1}(\hat{x}_{N-1}, u_{N-1}^o, t_{N-1}) + \underset{x_N}{E}\left\{ f(x_N) \mid U_{N-2}^o, u_{N-1}^o, Z_{N-1} \right\} \tag{A.3-8}$$

Using the identity

$$p(x \mid y) = \int \cdots \int p(x \mid y, z)\, p(z \mid y)\, dz = \underset{z}{E}\left\{ p(x \mid y, z) \mid y \right\} \tag{A.3-9}$$

for probability density functions of the random vectors x and z, given a
known value of y, Eq. (A.3-7) can be expanded as follows

$$\underset{x_N}{E}\left\{ f(x_N) \mid U_{N-2}^o, u_{N-1}^o, Z_{N-1} \right\}
= \int \cdots \int f(x_N)\, \underset{z_N}{E}\left\{ p(x_N \mid U_{N-2}^o, u_{N-1}^o, Z_N) \mid U_{N-2}^o, u_{N-1}^o, Z_{N-1} \right\} dx_N$$
$$= \underset{z_N}{E}\left\{ \underset{x_N}{E}\left\{ f(x_N) \mid U_{N-2}^o, u_{N-1}^o, Z_N \right\} \mid U_{N-2}^o, u_{N-1}^o, Z_{N-1} \right\} \tag{A.3-10}$$

* Compare Eq. (A.3-7) with Eq. (A.3-3) and note that the expectation in
Eq. (A.3-7) is different from $J_N^o$ because the former is conditioned on the
measurements only up to time $t_{N-1}$ whereas $J_N^o$ is conditioned on the
measurements up to time $t_N$.

A comparison of the last line of Eq. (A.3-10) with Eqs. (A.3-2) and (A.3-4)
reveals that

$$\underset{x_N}{E}\left\{ f(x_N) \mid U_{N-2}^o, u_{N-1}^o, Z_{N-1} \right\} = \underset{\hat{x}_N}{E}\left\{ J_N^o(\hat{x}_N) \mid U_{N-2}^o, u_{N-1}^o, Z_{N-1} \right\} \tag{A.3-11}$$

Substitution of Eq. (A.3-11) into (A.3-8) produces

$$J_{N-1}^o = \bar{L}_{N-1}(\hat{x}_{N-1}, u_{N-1}^o, t_{N-1}) + \underset{\hat{x}_N}{E}\left\{ J_N^o(\hat{x}_N) \mid U_{N-2}^o, u_{N-1}^o, Z_{N-1} \right\} \tag{A.3-12}$$

Now recall our assumption that the optimal control $u_{N-1}^o$ at stage
N-1 is known; however, in fact our objective is to determine it from the
functional form of $J_{N-1}$. This is accomplished by rewriting Eq. (A.3-12)
in the equivalent form

$$J_{N-1}^o = \min_{u_{N-1} \in \Omega_{N-1}} \left\{ \bar{L}_{N-1}(\hat{x}_{N-1}, u_{N-1}, t_{N-1}) + \underset{\hat{x}_N}{E}\left\{ J_N^o(\hat{x}_N) \mid U_{N-2}^o, u_{N-1}, Z_{N-1} \right\} \right\} \tag{A.3-13}$$


and  by  carrying  out  the  indicated  minimization  over  uN_^.  Equation 
(A.  3-13)  has  the  important  property  that  the  optimal  cost  to  complete  the 
process  at  stage  N-l  can  be  expressed  in  terms  of  the  "incremental  cost" 
Ln  j  and  the  optimal  terminal  cost  which  we  computed  at  stage  N 
(see  Eqs.  (A.  3-2)  through  (A.  3-4)).  This  has  the  recursive  form  which 
we  desire;  now  we  proceed  to  show  that  the  dependence  of  H^-l  upon  Pre” 
vious  controls  and  measurements  can  be  expressed  solely  in  terms  of 

-N-l* 

In  order  to  determine  u,T  *  we  must  be  able  to  compute  the 

— N-l 

averages  defined  in  Eq.  (A.  3-13).  This  can  be  done  using  the  properties 
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of  Eqs.  (A.  3-6)  and  (A.  3-11).  From  Eq.  (A.  3-6)  we  have 
C(-N-l’  —N-l*  ^-l)  "{• '  S  hi-li-N-V  -N-r  tN-l)p(-N-l  |^N“2*  ZN-l)  ^N-l 

_  S*“$  LN-l(-N-l’  -N-l’  *N-l)  6XP  l"I  (-N-1  ~-N-l)TpN-l(-N-l~-N-l)| ^N-l 


(2v)n/‘  7Det(PN_i; 


(A.  3-14) 


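where $\hat{\mathbf{x}}_{N-1}$ and $P_{N-1}$ are provided by the Kalman filter equations,
Eqs. (A.2-2) and (A.2-4), operating on the measurement and control
histories, $Z_{N-1}$ and $U_{N-2}$. To compute the second expectation in Eq.
(A.3-13), recall from Eq. (A.2-2) that the Kalman filter output at stage
N is given by

$$\hat{\mathbf{x}}_N = \Phi_{N-1}\hat{\mathbf{x}}_{N-1} + \Gamma_{N-1}\mathbf{u}_{N-1} + K_N \left[ \mathbf{z}_N - H_N \left( \Phi_{N-1}\hat{\mathbf{x}}_{N-1} + \Gamma_{N-1}\mathbf{u}_{N-1} \right) \right] \tag{A.3-15}$$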

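In addition, using Eq. (A.1-10) note that the measurement $\mathbf{z}_N$ can be
written as

$$\mathbf{z}_N = H_N \left( \Phi_{N-1}\mathbf{x}_{N-1} + \Gamma_{N-1}\mathbf{u}_{N-1} + \mathbf{w}_{N-1} \right) + \mathbf{v}_{N-1} \tag{A.3-16}$$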


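Now regarding $\mathbf{u}_{N-1}$ as a set of parameters to be determined and knowing
that $\mathbf{x}_{N-1}$, $\mathbf{w}_{N-1}$, and $\mathbf{v}_{N-1}$ are all independent gaussian random variables,
it follows that $\mathbf{z}_N$ is a gaussian random variable whose mean and covariance
can be derived directly from Eq. (A.3-16):

$$\begin{aligned}
E\{\mathbf{z}_N\} &\triangleq \bar{\mathbf{z}}_N = H_N \left( \Phi_{N-1}\hat{\mathbf{x}}_{N-1} + \Gamma_{N-1}\mathbf{u}_{N-1} \right) \\
E\left\{ (\mathbf{z}_N - \bar{\mathbf{z}}_N)(\mathbf{z}_N - \bar{\mathbf{z}}_N)^T \right\} &\triangleq S_N = H_N \left( \Phi_{N-1} P_{N-1} \Phi_{N-1}^T + Q_{N-1} \right) H_N^T + R_{N-1}
\end{aligned} \tag{A.3-17}$$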
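As a minimal sketch of Eq. (A.3-17), the following Python computes the predicted measurement mean and the residual covariance for small matrices stored as nested lists. The helper names (`matmul`, `transpose`, `add`, `predicted_measurement`) and all numerical values are illustrative assumptions, not taken from this report; `r` stands for the measurement noise covariance that forms the last term of Eq. (A.3-17).

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def predicted_measurement(phi, gamma, h, x_hat, u, p, q, r):
    """Mean and covariance of the next measurement, following Eq. (A.3-17)."""
    x_pred = add(matmul(phi, x_hat), matmul(gamma, u))       # Phi x_hat + Gamma u
    z_bar = matmul(h, x_pred)                                 # H (Phi x_hat + Gamma u)
    p_pred = add(matmul(matmul(phi, p), transpose(phi)), q)   # Phi P Phi^T + Q
    s = add(matmul(matmul(h, p_pred), transpose(h)), r)       # H (...) H^T + R
    return z_bar, s


# Hypothetical two-state example (position, velocity) with a position measurement.
phi = [[1.0, 1.0], [0.0, 1.0]]
gamma = [[0.5], [1.0]]
h = [[1.0, 0.0]]
x_hat = [[2.0], [-1.0]]
u = [[0.4]]
p = [[1.0, 0.0], [0.0, 1.0]]
q = [[0.1, 0.0], [0.0, 0.1]]
r = [[0.5]]
z_bar, s = predicted_measurement(phi, gamma, h, x_hat, u, p, q, r)
```

Note that the control enters only the mean, which is why the control can be treated as a parameter while the covariance is computed once per stage.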
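If we define

$$\boldsymbol{\xi}_N \triangleq \mathbf{z}_N - \bar{\mathbf{z}}_N \tag{A.3-18}$$

and substitute Eqs. (A.3-16), (A.3-17), and (A.3-18) into Eq. (A.3-15)
the result is

$$\hat{\mathbf{x}}_N = \Phi_{N-1}\hat{\mathbf{x}}_{N-1} + \Gamma_{N-1}\mathbf{u}_{N-1} + K_N \boldsymbol{\xi}_N \tag{A.3-19}$$

where the so-called measurement residual, $\boldsymbol{\xi}_N$, is a zero mean gaussian
random variable having covariance equal to $S_N$ in Eq. (A.3-17). Therefore
the second expectation in Eq. (A.3-13) can be written as

$$\begin{aligned}
E\left\{ J_N^o(\hat{\mathbf{x}}_N) \mid U_{N-2}, \mathbf{u}_{N-1}, Z_{N-1} \right\} &= \int \cdots \int J_N^o\left( \Phi_{N-1}\hat{\mathbf{x}}_{N-1} + \Gamma_{N-1}\mathbf{u}_{N-1} + \boldsymbol{\lambda} \right) p(\boldsymbol{\lambda})\, d\boldsymbol{\lambda} \\
\Lambda &\triangleq K_N S_N K_N^T \\
\boldsymbol{\lambda} &\triangleq K_N \boldsymbol{\xi}_N
\end{aligned} \tag{A.3-20}$$

where $p(\boldsymbol{\lambda})$ denotes the zero mean gaussian density of $\boldsymbol{\lambda}$ with covariance $\Lambda$.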
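For a scalar state the relations in Eqs. (A.3-18) through (A.3-20) can be sketched as follows. The function names, the midpoint-rule quadrature, and the numbers are assumptions made for illustration only; the quadratic cost is chosen because its average has the closed form `x_pred**2 + k*s*k`, which gives a check on the numerical integral.

```python
import math


def kalman_update(x_pred, z, h, k):
    """Scalar form of Eqs. (A.3-15) and (A.3-19): prediction plus gain times residual."""
    xi = z - h * x_pred          # measurement residual, Eq. (A.3-18)
    return x_pred + k * xi


def expected_next_cost(cost, x_pred, k, s, half_width=8.0, steps=4000):
    """Average cost(x_pred + lam) over lam ~ N(0, k*s*k) using the midpoint rule."""
    var = k * s * k                          # scalar version of Lambda = K S K^T
    sigma = math.sqrt(var)
    d_lam = 2.0 * half_width * sigma / steps
    total = 0.0
    for j in range(steps):
        lam = -half_width * sigma + (j + 0.5) * d_lam
        total += cost(x_pred + lam) * math.exp(-0.5 * lam * lam / var)
    return total * d_lam / math.sqrt(2.0 * math.pi * var)


x_new = kalman_update(x_pred=1.2, z=2.0, h=1.0, k=0.5)
avg = expected_next_cost(lambda x: x * x, x_pred=1.2, k=0.5, s=2.6)
```

Here `x_pred` plays the role of $\Phi_{N-1}\hat{\mathbf{x}}_{N-1} + \Gamma_{N-1}\mathbf{u}_{N-1}$, so the average depends on the control only through the predicted estimate.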

Equations (A.3-13), (A.3-14), (A.3-17), and (A.3-20) provide
all the relations needed to calculate the optimum control $\mathbf{u}_{N-1}^o$ as a
function of $\hat{\mathbf{x}}_{N-1}$ conditioned on $U_{N-2}$ and $Z_{N-1}$. Furthermore it is clear that
the dependence of $J_{N-1}^o$ upon $U_{N-2}$ and $Z_{N-1}$ can also be expressed
completely in terms of $\hat{\mathbf{x}}_{N-1}$, just as $J_N^o$ depends only upon $\hat{\mathbf{x}}_N$ in Eq. (A.3-4).

Therefore, by induction it can be established that Eqs. (A.3-13) through
(A.3-20) hold if the index N is changed as follows:

$$N \to j; \qquad j = N, N-1, \ldots, 1$$
Consequently by recursively carrying out the minimization specified in
Eq. (A.3-13), the entire set of optimal controls can be generated with
each control $\mathbf{u}_i^o$ being a function of $\hat{\mathbf{x}}_i$. This method of solving the
control problem is referred to as dynamic programming.*

Notice that a form of separation principle holds for the above
optimal control law in the sense that $\mathbf{u}_i^o$ is always a function of the
conditional mean (optimum estimate) of the state. The latter is provided by a
Kalman filter in the same fashion described for quadratic performance
indices in Section A.2. However, the optimal controls are not generally
linear functions of the estimated state nor are they independent of the
statistics of the random processes as evidenced by Eqs. (A.3-14) through
(A.3-20). The mechanization of the optimal controller is illustrated in
Fig. A.3-1; this should be compared with the linear system in Fig. A.2-1.


Figure A.3-1  Structure of the Optimal Stochastic Controller for a Linear Plant with a General Performance Index


* See Ref. 15 for a discussion of dynamic programming applied to
stochastic control problems.
Except in special cases, the optimal controls defined by
Eq. (A.3-13) cannot be determined analytically; often the dynamic
programming equations must be solved numerically and the optimal control
variables stored as functions of the estimated state. A great deal of
computational effort and computer storage may be required to accomplish this
task because of the multidimensional integrals in Eqs. (A.3-14) and (A.3-20)
that must be evaluated. Some simplification can be gained by expressing
these integrals as solutions to the corresponding multidimensional
diffusion partial differential equation (Ref. 14). However, the amount of
computation required for more than two variables of integration is still
formidable. Consequently, in formulating an optimal control problem for a linear
system with nonquadratic performance indices and/or constrained control
variables, an effort must be made to limit the number of integrations
required in Eqs. (A.3-14) and (A.3-20). Situations where this objective can
be achieved are described in Section A.4.
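The growth in storage can be made concrete with a rough count. The sketch below assumes, purely for illustration, that the optimal control is tabulated on a uniform grid with the same number of points along each element of the estimated state; this discretization and the name `table_entries` are hypothetical and are not prescribed by this report.

```python
def table_entries(points_per_axis, n_states, n_stages):
    """Number of stored control values for a uniform grid over the estimated state."""
    return n_stages * points_per_axis ** n_states


one_state = table_entries(points_per_axis=50, n_states=1, n_stages=100)
four_states = table_entries(points_per_axis=50, n_states=4, n_stages=100)
```

Under these assumptions a one-variable problem needs a few thousand entries while a four-variable problem needs several hundred million, before counting the integrations needed to fill the table.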


A. 4  COMPUTATIONAL  CONSIDERATIONS 

In  the  preceding  section  it  is  pointed  out  that  there  are  large 
computational  requirements  associated  with  solving  the  optimal  stochastic 
control  problem  for  a  linear  system  with  an  arbitrary  performance  index. 
To  minimize  the  amount  of  computation,  the  dimensionality  of  the  integrals 
in  Eqs.  (A.  3-14)  and  (A.  3-20)  must  be  kept  as  low  as  possible.  This  can 
be  accomplished  by  choosing  a  performance  index  that  depends  on  as  few 
variables as possible at each stage of the backward recursion in Eq.
(A.3-13). To illustrate, suppose that $L_i = 0$ for all i in Eq. (A.2-5) and

$$J = E\{f(\mathbf{x}_N)\} = E\{f(x_1(t_N))\} \tag{A.4-1}$$
where $x_1(t_N)$ denotes the first element of $\mathbf{x}_N$. That is, the index is a
function of only one state variable at the terminal time and is independent
of the state and control variables at other times. In addition, specialize
Eq. (A.1-10) to the case where the control input is a scalar,

$$\mathbf{x}_{i+1} = \Phi_i \mathbf{x}_i + \Gamma_i u_i + \mathbf{w}_i \tag{A.4-2}$$

and impose the constraint

$$|u_i| \le D; \quad \text{for all } i \tag{A.4-3}$$

where D is a specified constant. The above design criteria are realistic
for the two-dimensional guidance problem for a tactical missile, where
$x_1(t_N)$ corresponds to the terminal miss distance and $u_i$ is the control
surface deflection.

In order to exploit the form of Eq. (A.4-1), it is necessary to
define a new state vector $\mathbf{y}_i$ by the linear transformation

$$\mathbf{y}_i \triangleq \Phi(t_N, t_i)\, \mathbf{x}_i; \qquad \Phi(t_N, t_N) = I \tag{A.4-4}$$

The matrix $\Phi(t_N, t_i)$ is the transition matrix from time $t_i$ to time $t_N$
associated with Eq. (A.4-2) and determined by*

$$\Phi(t_N, t_i) = \prod_{j=N-1}^{i} \Phi_j \tag{A.4-5}$$

* The notation means the product $\Phi_{N-1}\Phi_{N-2} \cdots \Phi_i$.
Therefore $\mathbf{y}_i$ is the value of the state at the terminal time produced by an
initial condition $\mathbf{x}_i$ at time $t_i$ with the control and process noise in Eq.
(A.4-2) equal to zero. Substitution for $\mathbf{x}_i$ and $\mathbf{x}_{i+1}$ from Eq. (A.4-4) into
Eq. (A.4-2) produces

$$\mathbf{y}_{i+1} = \mathbf{y}_i + \boldsymbol{\delta}_i u_i + \boldsymbol{\omega}_i; \qquad \boldsymbol{\delta}_i \triangleq \Phi(t_N, t_{i+1})\, \Gamma_i \tag{A.4-6}$$

where $\boldsymbol{\omega}_i$ is a zero-mean gaussian vector random variable with covariance
matrix

$$E\{\boldsymbol{\omega}_i \boldsymbol{\omega}_i^T\} = \Phi(t_N, t_{i+1})\, Q_i\, \Phi(t_N, t_{i+1})^T \tag{A.4-7}$$

and $Q_i$ is defined in Eq. (A.1-13). Because the transformation in Eq.
(A.4-4) is nonsingular,* Eq. (A.4-6) is equivalent to Eq. (A.4-2) in
describing the system dynamics.
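The transformation can be checked numerically. The sketch below builds $\Phi(t_N, t_i)$ from Eq. (A.4-5) and confirms, for a noise-free step, that the transformed state advances as in Eq. (A.4-6). The three-stage position/velocity model, the helper names, and the numbers are assumptions chosen for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


def matvec(a, v):
    return [sum(a_rc * v_c for a_rc, v_c in zip(row, v)) for row in a]


def transition_to_terminal(phis, i):
    """Phi(t_N, t_i) = Phi_{N-1} ... Phi_i, Eq. (A.4-5); identity when i == N."""
    n = len(phis[0])
    result = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for j in range(i, len(phis)):        # left-multiply by Phi_i first, Phi_{N-1} last
        result = matmul(phis[j], result)
    return result


# Hypothetical example: N = 3 stages, time step 0.5, scalar control.
phis = [[[1.0, 0.5], [0.0, 1.0]] for _ in range(3)]
gamma = [0.125, 0.5]
x_i, u_i, i = [2.0, -1.0], 0.4, 1

x_next = [a + g * u_i for a, g in zip(matvec(phis[i], x_i), gamma)]  # Eq. (A.4-2), no noise
y_i = matvec(transition_to_terminal(phis, i), x_i)                    # Eq. (A.4-4)
y_next = matvec(transition_to_terminal(phis, i + 1), x_next)
delta_i = matvec(transition_to_terminal(phis, i + 1), gamma)          # Eq. (A.4-6)
y_check = [y + d * u_i for y, d in zip(y_i, delta_i)]
```

The agreement of `y_next` and `y_check` reflects the fact that the transformed state has no free dynamics: it changes only through the control and the noise.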


The important thing about Eq. (A.4-6) is that the first element
$y_1(t_{i+1})$ of $\mathbf{y}_{i+1}$ is independent of the last n-1 elements of $\mathbf{y}_i$. Therefore,
if the performance index in Eq. (A.4-1) is expressed in terms of the
variables $\mathbf{y}_i$, the integration in Eq. (A.3-20) is performed over only one state
variable. To prove this assertion it is convenient to use the fact that the
optimum (Kalman filter) estimates of $\mathbf{x}$ and $\mathbf{y}$ are also related by Eq.
(A.4-4) (see Ref. 13):

$$\hat{\mathbf{y}}_i = \Phi(t_N, t_i)\, \hat{\mathbf{x}}_i \tag{A.4-8}$$

* The discrete time transition matrix is always nonsingular when $\Phi_i$ in Eq.
(A.4-2) is derived by discretizing a continuous time system.
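Applying Eq. (A.4-8) to Eqs. (A.3-15) through (A.3-19) it follows that

$$\hat{\mathbf{y}}_i = \hat{\mathbf{y}}_{i-1} + \boldsymbol{\delta}_{i-1} u_{i-1} + \Phi(t_N, t_i)\, K_i \boldsymbol{\xi}_i \tag{A.4-9}$$

where $\boldsymbol{\xi}_i$ is given by Eqs. (A.3-15) and (A.3-16). The dynamics of the
estimate of the first element, $y_1$, of $\mathbf{y}_i$ are therefore given by the scalar
equation

$$\hat{y}_1(t_i) = \hat{y}_1(t_{i-1}) + \delta_{1,i-1}\, u_{i-1} + \alpha_i \tag{A.4-10}$$

where $\alpha_i$ is a zero mean gaussian random variable and $\delta_{1,i-1}$ is the first
element of $\boldsymbol{\delta}_{i-1}$. The mean square value of $\alpha_i$ is given by

$$E\{\alpha_i^2\} \triangleq \sigma_i^2 = \mathbf{a}^T \Phi(t_N, t_i)\, K_i S_i K_i^T\, \Phi(t_N, t_i)^T \mathbf{a}; \qquad \mathbf{a}^T \triangleq [\,1 \;\; 0 \;\; \cdots \;\; 0\,] \tag{A.4-11}$$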


where $K_i$ is defined by Eq. (A.3-15) with $N \to i$. Observe that the only
element of $\hat{\mathbf{y}}_{i-1}$ appearing in Eq. (A.4-10) is $\hat{y}_1(t_{i-1})$.
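As a small illustration of Eq. (A.4-11), the sketch below evaluates $\sigma_i^2$ under the simplifying assumption of a single scalar measurement, so that the residual covariance is a number and the gain is a column of n values. The function name and the inputs are hypothetical.

```python
def alpha_variance(phi_to_terminal, gain, s):
    """sigma_i^2 = a^T Phi K S K^T Phi^T a for a scalar measurement (S is a number)."""
    first_row = phi_to_terminal[0]                          # a^T Phi selects the first row
    a_phi_k = sum(p * k for p, k in zip(first_row, gain))   # scalar a^T Phi K
    return a_phi_k * s * a_phi_k


sigma_sq = alpha_variance([[1.0, 1.0], [0.0, 1.0]], gain=[0.4, 0.2], s=2.6)
```

Only the first row of the transition matrix is needed, which mirrors the observation that the scalar recursion involves a single element of the transformed estimate.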

We also know that

$$J = E\{f(\mathbf{x}_N)\} = E\{f(\mathbf{y}_N)\} \tag{A.4-12}$$

because $\mathbf{y}_N = \mathbf{x}_N$ according to Eq. (A.4-4). Combining Eqs. (A.4-1),
(A.4-10) and (A.4-12) with Eqs. (A.3-13) and (A.3-20), letting $N \to i$ and
setting the incremental cost $\bar{L}_{i-1} = 0$ in Eq. (A.3-13) produces
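$$J_{i-1}^o\left(\hat{y}_1(t_{i-1})\right) = \min_{|u_{i-1}| \le D} \frac{1}{\sqrt{2\pi}\,\sigma_i} \int_{-\infty}^{\infty} J_i^o\left( \hat{y}_1(t_{i-1}) + \delta_{1,i-1}\, u_{i-1} + \alpha \right) \exp\left( -\frac{\alpha^2}{2\sigma_i^2} \right) d\alpha \tag{A.4-13}$$

Thus we have reduced the problem of finding the optimal control to that of
solving Eq. (A.4-13) recursively, a task that requires averaging over only
one variable, $\alpha_i$, as opposed to averaging over n variables, $\boldsymbol{\lambda}$, in Eq.
(A.3-20).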
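One stage of the scalar recursion can be sketched by brute force: average the next-stage cost over the gaussian variable and search a finite set of admissible controls. The midpoint-rule quadrature, the uniform control grid, and the function names are assumptions made for illustration; they are one possible numerical treatment, not the procedure of this report.

```python
import math


def smoothed(cost, rho, sigma, half_width=8.0, steps=2000):
    """E{cost(rho + alpha)} for alpha ~ N(0, sigma^2), midpoint rule."""
    d_a = 2.0 * half_width * sigma / steps
    total = 0.0
    for j in range(steps):
        a = -half_width * sigma + (j + 0.5) * d_a
        total += cost(rho + a) * math.exp(-0.5 * a * a / (sigma * sigma))
    return total * d_a / (math.sqrt(2.0 * math.pi) * sigma)


def backward_step(next_cost, y1_hat, delta1, sigma, d_max, n_controls=41):
    """One stage of the scalar recursion: search controls with |u| <= d_max."""
    best_u, best_cost = None, float("inf")
    for j in range(n_controls):
        u = -d_max + 2.0 * d_max * j / (n_controls - 1)
        c = smoothed(next_cost, y1_hat + delta1 * u, sigma)
        if c < best_cost:
            best_u, best_cost = u, c
    return best_u, best_cost


# Quadratic terminal cost; the averaged cost is then rho**2 + sigma**2.
u_small, cost_small = backward_step(lambda t: t * t, 0.3, 1.0, 0.5, 1.0)
u_large, cost_large = backward_step(lambda t: t * t, 2.0, 1.0, 0.5, 1.0)
```

In this example the search nulls the predicted miss when the control limit allows it and saturates otherwise.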


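Further simplification can be made to Eq. (A.4-13) if the index
$J_i^o$ has certain properties. In particular, suppose $J_i^o(\tau)$ is a convex,*
even** function of its argument, $\tau = \hat{y}_1(t_i)$, as defined by:

$$\begin{aligned}
\text{Even Property:} \quad & J_i^o(\tau) = J_i^o(-\tau) \\
\text{Convexity:} \quad & \tfrac{1}{2} J_i^o(\tau_1) + \tfrac{1}{2} J_i^o(\tau_2) \ge J_i^o\left( \frac{\tau_1 + \tau_2}{2} \right) \quad \text{for all } \tau_1 \text{ and } \tau_2
\end{aligned} \tag{A.4-14}$$

Now make the definition

$$\rho \triangleq \hat{y}_1(t_{i-1}) + \delta_{1,i-1}\, u_{i-1} \tag{A.4-15}$$

* The results obtained here will hold under more general assumptions;
however convexity is a sufficiently broad condition for our purpose.

** Convex even functions are a very broad class; some examples are
$\tau^2$, $|\tau|$, $\tau^4$, and $e^{|\tau|}$.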
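so that Eq. (A.4-13) can be rewritten as

$$J_{i-1}^o\left(\hat{y}_1(t_{i-1})\right) = \min_{|u_{i-1}| \le D} \bar{J}_i(\rho); \qquad \bar{J}_i(\rho) \triangleq \frac{1}{\sqrt{2\pi}\,\sigma_i} \int_{-\infty}^{\infty} J_i^o(\rho + \alpha) \exp\left( -\frac{\alpha^2}{2\sigma_i^2} \right) d\alpha \tag{A.4-16}$$

If the properties in Eq. (A.4-14) hold for $J_i^o(\tau)$ then they also hold for
$\bar{J}_i(\rho)$ with respect to the variable $\rho$. The even property can be
demonstrated by substituting $-\rho$ for $\rho$ in Eq. (A.4-16) and changing the variable
of integration from $\alpha$ to $\lambda$ according to

$$-\rho + \alpha = -\rho - \lambda$$

The convexity property is established by writing

$$\begin{aligned}
\tfrac{1}{2} \bar{J}_i(\rho_1) &= \frac{1}{2\sqrt{2\pi}\,\sigma_i} \int_{-\infty}^{\infty} J_i^o(\rho_1 + \alpha) \exp\left( -\frac{\alpha^2}{2\sigma_i^2} \right) d\alpha \\
\tfrac{1}{2} \bar{J}_i(\rho_2) &= \frac{1}{2\sqrt{2\pi}\,\sigma_i} \int_{-\infty}^{\infty} J_i^o(\rho_2 + \alpha) \exp\left( -\frac{\alpha^2}{2\sigma_i^2} \right) d\alpha \\
\bar{J}_i\left( \frac{\rho_1 + \rho_2}{2} \right) &= \frac{1}{\sqrt{2\pi}\,\sigma_i} \int_{-\infty}^{\infty} J_i^o\left( \frac{\rho_1 + \rho_2}{2} + \alpha \right) \exp\left( -\frac{\alpha^2}{2\sigma_i^2} \right) d\alpha
\end{aligned}$$

and using the fact, taken from Eq. (A.4-14), that

$$\tfrac{1}{2} J_i^o(\rho_1 + \alpha) + \tfrac{1}{2} J_i^o(\rho_2 + \alpha) \ge J_i^o\left( \frac{\rho_1 + \rho_2}{2} + \alpha \right)$$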
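The two properties can be spot-checked numerically. The sketch below smooths the convex even cost $|\tau|$ with a gaussian weight, using a midpoint rule that is an assumption of this example, and then tests evenness and midpoint convexity on a grid. A check on a grid is an illustration of the argument, not a proof.

```python
import math


def smoothed(cost, rho, sigma, half_width=8.0, steps=2000):
    """E{cost(rho + alpha)} for alpha ~ N(0, sigma^2), midpoint rule."""
    d_a = 2.0 * half_width * sigma / steps
    total = 0.0
    for j in range(steps):
        a = -half_width * sigma + (j + 0.5) * d_a
        total += cost(rho + a) * math.exp(-0.5 * a * a / (sigma * sigma))
    return total * d_a / (math.sqrt(2.0 * math.pi) * sigma)


sigma = 0.7
points = [-2.0 + 0.125 * k for k in range(33)]            # symmetric grid on [-2, 2]
values = [smoothed(abs, p, sigma) for p in points]

# Evenness: the value at rho equals the value at -rho.
even_gap = max(abs(values[k] - values[32 - k]) for k in range(33))

# Midpoint convexity on every pair of grid points whose midpoint is on the grid.
convex_gap = min(
    0.5 * values[a] + 0.5 * values[b] - values[(a + b) // 2]
    for a in range(33) for b in range(33) if (a + b) % 2 == 0
)
```

A value of `even_gap` near zero and a nonnegative `convex_gap` (up to rounding) are what the argument above predicts.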
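Consequently the minimum value of $\bar{J}_i(\rho)$ in Eq. (A.4-16) is achieved with
the value $u_{i-1}^o$ of $u_{i-1}$ that minimizes $|\rho|$ subject to the constraint on the
control level, as demonstrated by Fig. A.4-1. This value of $u_{i-1}$ is easily
obtained from Eqs. (A.4-3) and (A.4-15) in the following form:

$$u_{i-1}^o = \begin{cases} -\dfrac{\hat{y}_1(t_{i-1})}{\delta_{1,i-1}}, & \left| \dfrac{\hat{y}_1(t_{i-1})}{\delta_{1,i-1}} \right| \le D \\[2ex] -D\, \mathrm{sgn}\left( \dfrac{\hat{y}_1(t_{i-1})}{\delta_{1,i-1}} \right), & \left| \dfrac{\hat{y}_1(t_{i-1})}{\delta_{1,i-1}} \right| > D \end{cases} \tag{A.4-17}$$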
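The control law of Eq. (A.4-17) amounts to clamping the control that would null the predicted miss. A minimal sketch follows; the function name is hypothetical, and the handling of a zero control effectiveness is a convention added here for the sake of a runnable example, since the report does not discuss that case.

```python
def optimal_control(y1_hat, delta1, d_max):
    """Saturated control: null the predicted miss if the control limit allows it."""
    if delta1 == 0.0:
        return 0.0                    # assumed convention: control has no effect on the miss
    u = -y1_hat / delta1              # control that makes rho = y1_hat + delta1 * u vanish
    return max(-d_max, min(d_max, u))  # clamp to |u| <= d_max
```

For example, `optimal_control(0.3, 1.0, 1.0)` lies inside the limit and nulls the predicted miss, while `optimal_control(2.0, 1.0, 1.0)` saturates at the lower limit.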


Figure A.4-1  A Graphical Illustration of the Fact that Minimizing $|\rho|$ also Minimizes any Convex Even Function of $\rho$.


From Eq. (A.4-17) we see that $u_{i-1}^o$ is an odd function of $\hat{y}_1(t_{i-1})$
and therefore so is $\rho\left(\hat{y}_1(t_{i-1}), u_{i-1}\right)\big|_{u_{i-1} = u_{i-1}^o}$. This implies that the
optimal cost in Eq. (A.4-16), given by

$$J_{i-1}^o\left(\hat{y}_1(t_{i-1})\right) = \bar{J}_i(\rho)\Big|_{u_{i-1} = u_{i-1}^o}$$

is an even convex function of $\hat{y}_1(t_{i-1})$; that is, it has the same properties
which we assumed for $J_i^o(\tau)$. Consequently, if $f(x_1(t_N))$ in Eq. (A.4-1)
is a convex even function, so are the functions $J_i^o(\hat{y}_1(t_i))$ for i = 0, 1, ..., N
and therefore Eq. (A.4-17) holds for all values of i. This result is
significant for mechanizing the optimal control law because each $u_i^o$ is given
analytically in terms of the optimal estimate of the transformed state
variable $\hat{y}_1$. There is no need for carrying out the integration in Eq.
(A.4-13) unless the actual value of $J_i^o$ is desired for the purpose of
evaluating performance.
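When a performance figure is wanted, one alternative to the integration is a Monte Carlo run of the scalar recursion in Eq. (A.4-10). The sketch below is an illustration under simplifying assumptions that are not from this report: the control effectiveness and the noise level are held constant over the stages, the cost is the absolute terminal value, and a fixed seed makes both policies see the same noise samples.

```python
import random


def saturated(y1_hat, delta1, d_max):
    """Clamp the control that would null the predicted miss."""
    u = -y1_hat / delta1
    return max(-d_max, min(d_max, u))


def mean_terminal_cost(policy, cost, y0, delta1, sigma, n_stages, trials, seed=0):
    """Monte Carlo estimate of E{cost(y1_hat(t_N))} under the scalar recursion."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        y = y0
        for _ in range(n_stages):
            y = y + delta1 * policy(y) + rng.gauss(0.0, sigma)
        total += cost(y)
    return total / trials


d_max, delta1 = 1.0, 1.0
with_law = mean_terminal_cost(lambda y: saturated(y, delta1, d_max),
                              abs, 5.0, delta1, 0.3, 10, 2000)
no_control = mean_terminal_cost(lambda y: 0.0, abs, 5.0, delta1, 0.3, 10, 2000)
```

With these hypothetical numbers the saturated law removes the initial offset within a few stages, so its average terminal cost is set mainly by the last noise sample, whereas the uncontrolled case retains the full offset.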


The above special case has been developed in detail here because
it has application for tactical missile guidance systems. The
discussion also demonstrates some systematic procedures (i.e., state
transformation and the use of convex, even cost functions) that can
greatly simplify the problem solution.